Preface





From Volume I: Advanced Problem Solving with Maple™: A 
First Course 


The study of problem solving is essential for anyone who desires to use applied 
mathematics to solve real-world problems. We present problem-solving topics 
using the computer algebra system Maple™ for solving mathematical equa- 
tions, creating models and simulations, as well as obtaining plots that help us 
perform our analyses. We present cogent applications of applied mathematics, 
demonstrate an effective use of a computational tool to assist in doing the 
mathematics, provide discussions of the results obtained using Maple, and 
stimulate thought and analysis of additional applications. This book serves as 
either an introductory course or capstone course, depending on the chapters 
covered, to start to prepare students to apply the mathematical modeling pro- 
cess by formulating, building, solving, analyzing, and criticizing mathematical 
models. It is intended for a first course at the sophomore or junior level for 
applied mathematics or operations research majors that introduces these stu- 
dents to mathematical topics that they will revisit within their major. This 
text can also be used for a beginning graduate-level course or as a capstone 
experience for mathematics majors as well as mathematics education majors, 
as modeling has a much bigger role in secondary education with the newest 
National Council of Teachers of Mathematics (NCTM) standards. We also 
introduce many additional mathematics topics that students may study more 
in depth later in their majors. 

Although calculus (either engineering or business) is the prerequisite mate- 
rial, many sections and chapters, especially in Volume II, require multivariable 
calculus. In addition, the use of linear algebra is required in some chapters. 
For students without the necessary background, these chapters can be omit- 
ted for a specific course. We realize that there are more chapters in this text 
than could ever be covered within one semester. The increased number of top- 
ics and chapters provides flexibility for designing a course appropriate to the 
background of your students. 
Goals and Orientation


This course bridges the study of mathematics topics and the applications of 
mathematics to various fields of mathematics, science, and engineering. This 
text affords the student an early opportunity to see how assumptions drive 
the models, as well as an opportunity to put the mathematical modeling pro- 
cess together. The student investigates real-world problems from a variety of 
disciplines such as mathematics, operations research, engineering, computer 
science, business, management, biology, physics, and chemistry. This book pro- 
vides introductory material to the entire modeling process. Students will find 
themselves applying the process and enhancing their problem-solving capa- 
bilities, becoming competent, confident problem solvers for the twenty-first 
century. Students are introduced to the following facets of problem solving: 


• CREATIVE PROBLEM SOLVING. Students learn the problem-solving process
by identifying the problem to be solved, making assumptions and
collecting data, proposing a model (or building a model), testing their
assumptions, refining the model as necessary, fitting the model to the data
if appropriate, and analyzing the mathematical structure of the model to
appraise the sensitivity of the results when the assumptions are not strictly
met.


• PROBLEM ANALYSIS. Given a model, students will learn to work backward
to uncover the assumptions, assess how well those assumptions fit the
scenario, and estimate the sensitivity of the results when the assumptions
are not strictly met.


• PROBLEM RESEARCH. The students investigate a specific area to gain
understanding of behavior and to learn how to apply or extend what has
already been created or developed to new scenarios.








Course Content 


We introduce problem solving early. Volume I, Advanced Problem Solving with
Maple™: A First Course, is a typical applied mathematics or introduction to
operations research course with topics including ODEs, mathematical
programming, data fitting with regression, probabilistic problem solving, and
simulation. This text, Volume II, contains discrete dynamical systems, both
constrained and unconstrained optimization, linear systems, advanced
regression, game theory, and multi-attribute decision making.

Organization of Text


Volume I covers introductory topics. Chapter 1 of Volume I is repeated in 
Volume II, introducing Maple and its basic command structure as well as 
introducing the problem-solving process. Because the book uses Maple as the 
tool in mathematical modeling, the chapter provides the foundation or cor- 
nerstone of using technology in the modeling process. In Volume I, Chapter 2 
introduces ordinary differential equations, and Chapter 3 covers systems of 
ordinary differential equations. Chapter 4 covers linear, integer, and mixed 
integer programming as well as the Simplex method. Chapter 5 covers model 
fitting, concentrating on regression methods to fit data. Chapter 6 covers sta- 
tistical and probabilistic problem solving; Chapter 7 extends these ideas into 
Monte Carlo simulations. Scenarios are developed within the scope of the 
problem-solving process. Student thought and exploration are required. 

Volume II covers more advanced topics. Chapter 1, the introduction to 
problem solving and to Maple, is repeated. Chapter 2 covers discrete dynami- 
cal systems, and is complementary to Chapters 2 and 3 of Volume I. Chapters 
3 and 4 cover optimization, both constrained and unconstrained, in single- 
variable and multivariable topics. Chapter 5 deals with solving problems from 
engineering, economics, and chemistry with linear systems. Chapter 6 contin- 
ues with regression but introduces more advanced topics such as non-linear 
regression, logistic regression, and Poisson regression. Chapter 7 covers game 
theory and relies heavily on linear and non-linear programming methods. 
Chapter 8 completes Volume II by discussing multi-attribute or multi-criteria 
decision making with methods such as Data Envelopment Analysis (DEA), 
Simple Additive Weighting (SAW), Analytic Hierarchy Process (AHP), and 
Technique of Order Preference by Similarity to the Ideal Solution (TOPSIS), 
and finishes with investigating methods of choosing weightings. 





Student Projects 


Student projects form the backbone of this course. In each project, students 
apply the mathematical modeling process and the mathematical tools they 
have learned. Each chapter and many sections have collections of student 
projects. We have seen significant student growth over the course of a semester 
in project reports from their first project to the final submission. Student 
projects take time to apply the modeling process, so we typically do not assign 
more than one project to a student. Most of these projects are designed to be 
group projects with two or three students working together, although they can
be done as individual projects with strong students. COMAP’s Mathematical
Contest in Modeling¹ provides a rich source of further significant projects.





Technology 


Technology is fundamental to serious mathematical modeling. We chose the 
computer algebra system Maple as our platform for this text; any technology 
could be used—from graphing and symbolic calculators to spreadsheets. 





Emphasis on Numerical Approximations 


Numerical solution techniques are used for solving dynamical systems, for
explicative modeling with some numerical analysis approaches, and in
optimization search procedures. These are the methods most easily employed in
iterative and recursive formulas. Early on, the student is exposed to numerical
techniques. We present numerical procedures as iterative and algorithmic.








Focus on Algorithms 


All algorithms are provided with step-by-step formats to aid students in learn- 
ing to do mathematical modeling with these methods and Maple. Examples 
follow the summary to illustrate an algorithm’s use and application. Through- 
out, we emphasize the process and interpretation, not the rote use of formulas. 





Problem Solving and Applications 


Each chapter includes examples of models, real-world applications, and prob- 
lems and projects. Problems are modeled, formulated, and solved within the 
scenario of the application. These models and applications play an important 
role in student growth in working in today’s complex world. 





¹See https://www.comap.com/undergraduate/contests/matrix/index.html
Exercises 


Exercises are provided at the end of each section and chapter so the student 
can practice the solution techniques and work with the mathematical concepts 
discussed. Review problems are given at the end of each chapter, some of which 
combine elements from several chapter sections. Projects are also provided at 
the end of each chapter to enhance student understanding of the concepts and 
their application to real-world problems. 








Computer Usage 


Maple is the computer algebra system used for this text. Tutorial labs are 
widely available for learning Maple’s syntax and command structure. Our 
emphasis is on providing the student with the ability to use Maple to assist 
with mathematical modeling. We illustrate graphing in both two and three 
dimensions. We provide Maple packages PSM and PSMv2 containing pro- 
grams, functions, and data to use with each volume. The packages are freely 
available from the Maple Cloud (https://maple.cloud). 
Introduction to Problem Solving and Maple











Objectives:
(1) Understand the nature of problem solving.
(2) Understand the use of Maple commands.
(3) Understand the Maple Applications Center and its uses.





1.1 Problem Solving 


What do we mean by problem solving? We interpret this as having a real
problem whose understanding and solution requires quantitative analysis and
one or more solution techniques using mathematics. To put this into context,
we say we need a well-defined problem. After we have a well-defined problem, we
must brainstorm variables and assumptions that might impact the problem.
We build or select a known model and choose a solution technique or
combination of techniques to obtain an answer. We solve and perform sensitivity
analysis. We interpret all results, implement, and if necessary, refine the entire
process.

In many ways, this process is very similar to the mathematical
modeling processes described in other texts: Albright [A2011],
Giordano et al. [GFH2013], and Meerschaert [M2007] to name a few. Readers may
want to examine these texts for a more detailed approach. As a co-author of
the Giordano text, my approach is most similar to the approach we describe
in that text.

There are four- and five-step processes for problem solving. We present 
and describe a simple five-step method. 




Step 1. Define and understand the problem. 


Step 2. Develop strategies to solve the problem. This includes a problem- 
solving formulation including a methodology to obtain a solution. 
If data is available, examine the data, plot it, and look for patterns. 


Step 3. Solve the problem formulated in Step 2. 


Step 4. Reflect on your process. You want to make sure
the solution answers the problem from Step 1. You also want to
ensure the results pass the “commonsense” test. If not, go back to
Step 2 and reformulate the strategy.


Step 5. If necessary, extend the problem. 


We will not concentrate on the modeling portion but on the selection of 
the model and the solution technique processes including the use of technology 
in the solution process. 

One key point is that our results must pass the “commonsense” test. For 
example, we were conducting spring-mass experiments in a classroom on the 
3rd floor of our mathematics and science building. The simple purpose was to 
investigate Hooke’s Law. The springs were small and the weights varied from 
a fraction to about 50 grams. After the experiments, we asked the students to 
calculate the stretch of their spring if it were attached to a seat, and they sat 
on the seat. Every student found an answer, but none said the spring would 
most likely break long before it stretched that far. 
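
The extrapolation the students carried out can be sketched in a few Maple lines. The numbers below are hypothetical stand-ins (a 50-gram mass stretching the spring 2 cm, and a 70-kg student); they are assumptions for illustration, not the classroom measurements.

```maple
# Hypothetical values, for illustration only
g := 9.81:                     # gravitational acceleration in m/s^2
m_test := 0.050:               # test mass in kg (50 grams)
x_test := 0.02:                # assumed observed stretch in m (2 cm)
k := m_test*g/x_test;          # Hooke's law F = k*x gives the spring constant in N/m
m_student := 70:               # assumed student mass in kg
x_student := m_student*g/k;    # predicted stretch in m
```

With these assumed values the formula predicts a stretch of about 28 meters for a spring a few centimeters long. The arithmetic is correct, but the result fails the commonsense test because Hooke's Law only holds within the spring's elastic range.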

Let’s preview a problem we will see in Volumes I and II. We have data
for time (t) and an index (y) in [0, 100]. Our plot shows a negative
linear trend. We compute the correlation, which is -0.94 and is interpreted
as a strong negative linear relationship. We use linear regression to build a
regression equation that has some very good diagnostics but one
questionable diagnostic from the residual plot. The main goal is predicting the
future, which is why the problem is being solved in the first place. The
predicted value of y comes out negative, which is not a possible value for y.
So we continue our problem solving and correct the residual plot issue by
adding a quadratic term to the regression equation. This time, our diagnostics
are all excellent. We attempt to use the model to predict, but our answer does
not pass the commonsense test because it is too large. A simple plot shows that
for the time value in the future we are on the increasing part of the quadratic
polynomial. If we cannot use our regression equation, then our work is useless.
Now we continue on to nonlinear regression and use an exponential function to
fit our data. Finally, not only are all the diagnostics excellent, but our use
of the new regression equation passes the commonsense test.
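
A minimal sketch of this three-model comparison follows. The data are hypothetical, invented only to mimic a decaying index; they are not the data set used later in the book, and the prediction time t = 100 is likewise an assumption. The sketch uses the Correlation, LinearFit, and ExponentialFit commands from the Statistics package.

```maple
with(Statistics):

# Hypothetical data: time T and an index Y that decays over time
T := Vector([0, 10, 20, 30, 40, 50], datatype = float):
Y := Vector([100, 61, 37, 22, 14, 8], datatype = float):

Correlation(T, Y);                        # strength of the linear relationship

lin  := LinearFit([1, t], T, Y, t);       # straight-line model
quad := LinearFit([1, t, t^2], T, Y, t);  # model with a quadratic term added
expo := ExponentialFit(T, Y, t);          # exponential model a*exp(b*t)

# Predict well beyond the data (t = 100 is an assumed future time)
eval(lin, t = 100);
eval(quad, t = 100);
eval(expo, t = 100);
```

For data shaped like these, one would expect the straight line to predict a negative index and the quadratic to turn upward beyond the data, while the exponential stays positive and keeps decreasing. Comparing the three predictions is exactly the commonsense check described above.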

We also believe that in the twenty-first century, technology is a key element
in all problem solving. Technology does not tell you what to do, but its use
provides insights and the ability to check out possibilities. In this book our
technology of choice is the computer algebra system Maple™.¹





1.2 Introduction to Maple 


Maple is a symbolic computation system or computer algebra system (CAS) 
that manipulates information in a symbolic or algebraic manner. You can 
use these symbolic capabilities to obtain exact, analytical solutions to many 
mathematical problems. Maple also provides numerical estimates to whatever 
precision is desired when exact solutions do not exist. 
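
As a small illustration of the difference between exact and numerical results, consider the following sketch. The particular equations are our own choice of examples.

```maple
solve(x^2 - 2 = 0, x);         # exact roots, expressed with sqrt(2)
evalf(sqrt(2), 30);            # the positive root to 30 significant digits
int(exp(-x^2), x = 0 .. 1);    # exact value expressed with the error function erf
fsolve(cos(x) = x, x);         # no closed form exists; numerical root near 0.739
```

The commands solve and int return symbolic answers, while evalf and fsolve return floating-point approximations. The second argument of evalf sets the number of digits.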

Maple 2019 and higher is different from previous versions of Maple. With
Maple 2019 you can create professional-quality documents, presentations, and
custom computational tools. You can access the power of the Maple
computational engine through a variety of interfaces: standard worksheet, classic
worksheet, command-line version, graphing calculator (Windows only), or
Maple applications. Although you type in the commands in a very similar
manner as in previous versions of Maple, the statement appears in a “pretty
print” format on the screen; that is, the statement appears more like typeset
mathematics.


Standard Worksheet 


This is a full-featured graphical user interface offering features that help to 
create documents that show all assumptions, the calculations, and any margin 
of error in your results. You can even hide the computations to focus on 
problem setup and final results. 


Classic Worksheet 


The basic worksheet environment works best for older computers with limited 
memory. 


Command-Line Version 


The command-line interface, without graphical user interface features, is used
for solving very large, complex problems or batch processing.





¹Maple is a trademark of Waterloo Maple, Inc.

Maplesoft™ Graphing Calculator


The graphical interface to the Maple computational engine allows you to per- 
form simple computations and create customizable, zoomable graphs. The 
Graphing Calculator is only available in the Windows version. 


Maple Applications 


The graphical user interface containing windows, textbox regions, and other 
visual features gives you point-and-click access to the power of Maple. It allows 
you to perform calculations and plot functions without using the worksheet 
or command-line interfaces. 

Maple’s extensive mathematical functionality can be accessed
through any of these interfaces. Older versions relied on the advanced
worksheet-based graphical interface. A worksheet is a flexible document for
exploring mathematical ideas or mathematical alternatives and even creating
technical reports.

Experimental mathematical modeling, a natural stepping stone to statisti- 
cal analysis, has an obvious coupling with computers, which can quickly solve 
equations, and plot and display data to assist in model test and evaluation. 
The software computer algebra system, Maple, is a powerful tool to assist in 
this process. When dealing with real-world problems, data requirements can be 
immense. When evaluating immigration trends, for example, and the political, 
social, and economic effects of these trends, thousands of data points are used; 
in some cases, millions of data points. Such problems cannot be analyzed by 
hand, effectively or efficiently. The manipulation required to plot, curve fit, 
and statistically analyze with goodness-of-fit techniques, cannot feasibly be 
done without the assistance of a computer software system. 

The Maple system is easy to learn and can be applied in many mathe- 
matical applications. While these demonstrate its versatility, Maple is also 
an extremely powerful software package. Maple provides over 5000 built-in 
definitions and mathematical functions, covering every mathematical inter- 
est: calculus, differential equations, linear algebra, statistics, and group the- 
ory to mention only a few. The statistical package reduces many standard 
time-consuming statistical questions into one-step solutions, including mean, 
median, percentile, kurtosis, moments, variance, standard deviation, and so 
forth. There are many references for Maple, and a short list would include: 


• Maple Quick Reference
• Maple Flight Manual
• Maple Language Reference Manual
• Maple Library Reference Manual
• Maple “What’s New” Release Notes
• Maple “Getting Started” tutorials and worksheets


This chapter presents a quick review of some basic Maple commands. It is
not intended to be a self-contained tutorial for Maple, but it does provide
a quick review of the basics prior to more sophisticated commands in the
modeling chapters. There are many good references for those new to Maple.
Additionally, there is a collection of self-contained tutorials accessed by
clicking the “Getting Started” button on Maple’s opening screen. See Figure 1.1.





1.3 The Structure of Maple 


Maple is an example of a computer algebra system (CAS). It is composed of
thousands of commands to execute operations in algebra, calculus, differential
equations, discrete mathematics, geometry, linear algebra, numerical analysis,
linear programming, statistics, and graphing. It has been logically designed
to minimize storage allocation while remaining user friendly. Maple allows
a user to solve and evaluate complicated equations and calculations,
analytically or numerically, such as optimization problems, least-squares
solutions to equations, and equations that involve special functions.


[Figure: Maple’s welcome screen, showing the “New to Maple?” quick
introduction, the New Document and New Worksheet buttons, and links to the
Maple Fundamentals Guide and the Maple Portal.]

FIGURE 1.1: Maple 2020 Opening Screen


The “New to Maple?” section in the opening screen will lead to videos, 
tutorials, and links to more information as seen in Figure 1.2. 

[Figure: the expanded “New to Maple?” section, with links including More
Online Resources, Student Help Center, Teacher Resource Center, and Online
Help.]

FIGURE 1.2: “New to Maple” Section of Maple 2019 Expanded


Commands 


To begin using Maple from the opening screen, we suggest clicking on the 
“New Worksheet” button shown in Figure 1.1. Enter commands at Maple’s 
command prompt: >. See Figure 1.3. A Worksheet shows command prompts, 
while a Document does not. 

FIGURE 1.3: Maple’s Command Prompt


Notation and Conventions 


Throughout this book, different types of fonts and styles are used to distin- 
guish between Maple commands, Maple output, and other information. Maple 
commands are copied directly from Maple 2019 and the output will immedi- 
ately follow the Maple commands. In the first example below, the variable a 
has been assigned an expression as its value. Notice that the symbol := is used 
to indicate this assignment. In the second line of the example, the value of a,
the expression, is differentiated with respect to x. 

Type the command in either the Worksheet or the Document mode as

    a := 2*x^3[→] - 5*x/6

(The [→] indicates pressing the “right arrow” cursor key to exit superscript
mode.) You will type the statement this way, but the screen version appears
as follows:

> a := 2x³ - 5x/6
                         a := 2x³ - (5/6)x

> diff(a, x)
                         6x² - 5/6

The screen version shows Maple’s mathematical interpretation of what you
have typed.
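
The same statements can also be entered as plain text (1-D input), which is how commands are usually written in scripts. The sketch below repeats the example and adds a few related calls of our own choosing.

```maple
a := 2*x^3 - 5*x/6;     # assign an expression to the name a
diff(a, x);             # first derivative: 6*x^2 - 5/6
diff(a, x, x);          # second derivative: 12*x
int(a, x);              # an antiderivative of a
eval(a, x = 2);         # value of the expression at x = 2: 43/3
```

Note that a is an expression, not a function, so it is evaluated at a point with eval rather than by writing a(2).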











1.4 General Introduction to Maple 


Maple has many types of windows: the worksheet or document window, help
window, 2-D plot window, 3-D plot window, and the animation window. Many
“Maple Assistants” use a “Maplet” window. In this book, the worksheet and
help windows are used most often. We provide a brief explanation of each.

The worksheet is where all interaction between Maple and the user occurs. 
Within a worksheet, commands (input) and text (remarks for clarification) are 
entered by the user, and results, numerical or symbolic (output and graphics) 
are produced by Maple. The user may manipulate these interactions to create 
a flowing document that can be saved by clicking File, then clicking Save As
followed by naming the document. The name of the document can be any 
word or group of letters and/or numbers, which has a length of less than nine 
characters. Once a document has initially been saved, it can be retrieved by 
clicking File, clicking Open, and entering the document name. Then it can be 
re-saved after modification by clicking File, followed by clicking Save. We will 
use Worksheet mode rather than Document mode since Worksheets show a 
command prompt, but Documents do not. 

The input and text regions of the worksheet can be modified to change a 
document, but the graphic and output regions cannot be modified once Maple 
inserts them into a document. A command must be edited and re-executed 
to alter its associated output. The input region is identified by the > prompt 
which precedes all command entries into Maple. (Note: Maple only recognizes 
the Maple generated > prompt. If the symbol > itself is typed by the user, 
Maple does not respond to it as an input prompt, but as the “greater than” 
relation.) The input commands, output characters and text regions are all of
different font size and color to assist the user in distinguishing between them. 
Text regions assist in documentation and explanation of the input/output 
regions of the document and they may be placed anywhere in a document. 
Graphic regions, once they are generated by input commands, can be copied 
and pasted into a worksheet or into another document. Once the graphic is 
pasted into the worksheet, it can no longer be edited or manipulated. The 
output regions are generated by the user’s input commands and cannot be 
manipulated once they appear in a document, although the user is allowed to 
delete these results. 

The Maple menu bar is located at the top of the screen on a Macintosh,
and immediately below the Maple title in Windows. The menu bar
gathers many commonly used options in one place for easy access.
The menu bar includes File, Edit,
View, Insert, Format, Plot, Tools, Window, and Help, much like any software.
Enter “? worksheet,reference,standard Menubar” for detailed information on 
all menus. Immediately below the Maple window’s title bar (Macintosh) or 
menu bar (Windows) is the Maple tool bar. The tool bar provides accelerated 
access to the most commonly used options; see Figure 1.4. Enter “? Worksheet- 
Toolbar” for detailed information. 

FIGURE 1.4: Maple’s Tool Bar


Maple syntax uses either a semicolon or a colon to end a statement. A
single command entered without a terminator is treated as ending with a
semicolon. Multiple statements may
be on the same line, but each must have its own colon or semicolon. The
colon suppresses output of the command, while the semicolon signals that
the results are to be printed to the screen immediately after the enter key is
pressed. There is a large set of commands either readily available in Maple
memory or stored separately in Maple packages, which makes memory use more
efficient. Standard commands such as addition and multiplication are
built-in, not contained in packages. Enter “? inifcns” to see the complete list
of functions always available.
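
The effect of the two terminators can be seen in a short sketch. The names r, area, and c are our own, chosen only for illustration.

```maple
r := 3:                    # colon: the assignment is made, nothing is displayed
area := Pi*r^2;            # semicolon: the result 9*Pi is displayed
c := 2*Pi*r: evalf(c);     # two statements on one line; only evalf(c) is displayed
```

Ending setup statements with a colon keeps a worksheet uncluttered, while the semicolon is reserved for the results you want to see.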


The Calling Sequence 


A “package” is a collection of related definitions and functions that can be
brought into a Maple session using the with command. The syntax for with is

> with(<package_name>);

When using commands stored in packages, such as graphing commands in the
plots package, the command “with(plots):” is issued prior to using any of the
commands in that package. The with command is required only once per
session. Several
of the Maple packages will be used in this text. Specifically, plots will be used
extensively for creating graphs, LinearAlgebra for linear algebra operations,
and Statistics for statistical and linear regression commands. Other important
packages include DEtools for differential equations and MultivariableCalculus,
Optimization, and simplex for optimization.
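
The following sketch shows two ways of reaching a package command: loading the package with with, or calling one command by its long name. The matrix and the data are arbitrary illustrative values.

```maple
with(LinearAlgebra):              # load the package once per session
A := Matrix([[2, 1], [1, 3]]):
Determinant(A);                   # short name is available after with(...): 5

# Long form: call one package command without loading the whole package
Statistics:-Mean([185, 202, 225, 195, 145]);
```

The long form PackageName:-Command is convenient when only one or two commands from a package are needed, and it avoids name clashes between packages.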


The Help Command 


The Maple help database can provide all the information found in the Maple 
Library Reference Manual. See Figure 1.5. However, help can be obtained 
immediately to assist the user in solving problems without leaving the doc- 
ument. Help can be acquired by a number of methods: click on Help in the 
menu bar, type “help” at the > prompt, or type a “?” at the > prompt. By 
using the ? or the help at the > prompt, the user must type the keyword 
for the help search. If the specific syntax of a command is in question, for 
example, the syntax for differentiation, type ? differentiate at the > prompt. 
This procedure is perhaps the most convenient one for help on syntax. “Quick 
Help,” the last item on the tool bar, gives quick feedback. 


[Figure: the Maple Help System welcome page, with sections “Using the Help
System” and “Resources”.]

FIGURE 1.5: The Maple Help System





Begin typing a command, say dif, then press [esc], the escape key. A pop-up
menu appears with a list of commands beginning with or related to dif.
Use the mouse to select the desired choice. This technique also
works with variables or names that have been defined. Enter and execute
myVariable := 10. Now type myVar, pause, and press [esc]. Maple fills in
the rest of the name. If more than one possibility exists, a pop-up menu will
appear with the choices.

Maple’s help database also includes a mathematical encyclopedia. Try
entering


> ? Definition, integral


Data Entry 


Commonly, a set of data will describe a process. Entering the data is the first 
step to analyzing the data. Maple provides a convenient method for manually 
entering data via a list. A list is a group of data to which Maple’s many 
operations are applied. Suppose a group of five college students’ weights are 
known to be 185, 202, 225, 195, and 145. We demonstrate the command
required to enter the data into Maple.


> weights := [185, 202, 225, 195, 145];
                weights := [185, 202, 225, 195, 145]


The use of brackets in the command indicates a list. A list
maintains the initial ordering of the data, allows duplicate values, and
may be manipulated by the methods described later in this book. Commas
are required between the list elements. Any alphanumeric element may be
included in a list; integers, rational numbers, decimals, strings, variable names,
and expressions are all allowed. If any data is missing, a placeholder
can be entered in its place, such as the letter x. If there were a sixth student
in the group with an unknown weight, a symbol could be used to represent
the sixth student’s weight.


> weights := [185, 202, 225, 195, 145, x];
                weights := [185, 202, 225, 195, 145, x]
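
Once data are stored in a list, a few basic commands are enough to inspect them. The sketch below uses the five known weights; the particular selection of commands is ours.

```maple
weights := [185, 202, 225, 195, 145]:
nops(weights);                         # number of entries: 5
weights[2];                            # second entry: 202
sort(weights);                         # [145, 185, 195, 202, 225]
add(w, w in weights)/nops(weights);    # exact mean: 952/5
[op(weights), x];                      # append a placeholder symbol x (x must be unassigned)
```

Here op extracts the entries of a list as a sequence, which makes it easy to build a longer list from an existing one.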


Data Entry and Verification 


The Maple lines below illustrate entering, verifying, and naming data
pertaining to the length and weight of bass caught during a fishing derby. In
subsequent chapters, several models for predicting the weight of a snook fish
as a function of the length of the fish are suggested. We enter the data in
rows. The data is printed to verify correct entry.


> length_inches := [12, 14, 12.5, 16, 21.5];
                length_inches := [12, 14, 12.5, 16, 21.5]
> weight_oz := [15, 21, 10, 33, 41];
                weight_oz := [15, 21, 10, 33, 41]

Correcting Erroneous Entries


In the above illustration, Maple displayed the snook fish data immediately 
after its input. This is done to help the user verify that all elements have been 
entered correctly. Reentry of each command is typically the best method for 
correcting an erroneous entry with many errors. 
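
When only a single entry is wrong, replacing that one element may be quicker than retyping the whole command. The sketch below assumes, purely hypothetically, that the third weight should have been 18 rather than 10.

```maple
weight_oz := [15, 21, 10, 33, 41]:
# Hypothetical correction: the third entry should be 18, not 10
weight_oz := subsop(3 = 18, weight_oz);    # returns [15, 21, 18, 33, 41]
```

The command subsop replaces the operand at the given position and returns a new list, which is then assigned back to the same name.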


Transformation and Functions 


As described, the symbol := is used to assign the value on the right-hand 
side of the statement to the name on the left-hand side of the statement. For 
example, x := 3; assigns the value 3 to x. To unassign x, use the command 
x := 'x';. Functions can be defined using the mapping arrow symbol ->. A 
function can be evaluated numerically or symbolically; we provide a short example. 


> f := x -> x^2 - 3*x + 4.5; 
f := x ↦ x^2 - 3*x + 4.5 


> f(5); 
14.5 


> f(x - 2); 
(x - 2)^2 - 3*x + 10.5 


A problem solution may suggest transforming a variable. Perhaps we want 
to transform both x and y by taking natural logarithms (ln) of each. In the 
two-dimensional case, the model suggests a functional relationship between 
a dependent and an independent variable. Given values of an independent 
variable, a function can transform the given data to yield predicted values 
for the dependent variable. Transforming data requires the understanding of 
algebraic operations and functions that are used in Maple. Table 1.1 presents 
the regular arithmetic operations that Maple recognizes. 


TABLE 1.1: Maple Arithmetic Functions 


Symbol   Operation 
+        Addition 
-        Subtraction 
*        Multiplication 
/        Division 
^        Exponentiation 


Many other functions are also recognized by Maple. Table 1.2 has an abbre- 
viated listing. 
TABLE 1.2: Maple Algebraic Functions 


Command        Operation 
abs            absolute value of real or complex argument 
arg            argument of a complex number 
ceil           least integer ≥ x 
conjugate      conjugate of a complex number 
exp            the exponential function: $e^x = \exp(x) = \sum_{k=0}^{\infty} x^k/k!$ 
factorial, !   the factorial function, factorial(n) = n! 
floor          greatest integer ≤ x 
ln             natural logarithm (with base e = 2.718...) 
log[b]         logarithm to arbitrary base b 
log10          log to the base 10 
max, min       maximum/minimum of a list of real numbers 
RootOf         function for expressing roots of algebraic equations 


We illustrate by taking the natural logarithm of our length and weight 
data. 
> ln_length := map(x -> evalf(ln(x)), length_inches); 
ln_length := [2.484906650, 2.639057330, 2.525728644, 2.772588722, 
    3.068052935] 
> ln_weight := map(x -> evalf(ln(x)), weight_oz); 
ln_weight := [2.708050201, 3.044522438, 2.302585093, 3.496507561, 
    3.713572067] 
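
The same transformation can also be written without an explicit arrow function. The following sketch applies ln to every element with Maple's element-wise operator ~ and then converts to decimals; the names log_length and log_weight are hypothetical and used only to avoid overwriting the results above.

```maple
# A sketch of the log transformation using the element-wise operator ~ instead of map.
# log_length and log_weight are hypothetical names chosen for this illustration.
length_inches := [12, 14, 12.5, 16, 21.5]:
weight_oz := [15, 21, 10, 33, 41]:
log_length := evalf(ln~(length_inches));   # ln applied to each length
log_weight := evalf(ln~(weight_oz));       # ln applied to each weight
```

Both forms should give the same numbers; which one to use is largely a matter of taste.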





Some other functions that are available will be discussed in a later chapter 
when considering statistical operations that may be performed on columns of 
data: sums, mean, standard deviation, and so forth. Table 1.3 presents a few 
examples of data transformations, with the Maple commands. 


TABLE 1.3: Examples of Maple Commands in the Worksheet 


Expression             Typed Command 
$x^2$                  x^2 
$2x^2 + 25x + 9$       2*x^2+25*x+9 
$(x^2 + 2)^{0.5}$      (x^2+2)^0.5 
Column Operations 


In this section, the operations that can be performed directly on worksheet 
columns are presented. The operations require the LinearAlgebra package to 
be loaded prior to use. The following demonstrates these commands with six 
examples using two columns of data: c1 := [1,2,3,4] and c2 := [5,6,7,8]. 


> with(LinearAlgebra): 
> c1 := <1|2|3|4>; 
c1 := [1 2 3 4] 
> c2 := <5|6|7|8>; 
c2 := [5 6 7 8] 

Summing the vectors c1 and c2 


> VectorAdd(c1, c2); 
[6 8 10 12] 
> c1 + c2; 
[6 8 10 12] 


To add a constant to the entries of c1, use ‘+~’ 


> 10 +~ c1; 
[11 12 13 14] 


To multiply c2 by a constant, use either 


> VectorScalarMultiply(c2, 0.5); 
[2.50000000000000000 3.0 3.50000000000000000 4.0] 
> 0.5 * c2; 
[2.5 3.0 3.5 4.0] 


To apply a function to the elements of a vector use map or ‘~’. 


> map(ln, c1); 
  evalf(%); 
[0 ln(2) ln(3) 2 ln(2)] 
[0.0 0.6931471806 1.098612289 1.386294361] 
> ln~(c2); 
[ln(5) ln(6) ln(7) 3 ln(2)] 
> map(x -> x^2, c2); 
[25 36 49 64] 
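
The element-wise operator ~ also combines two columns entry by entry. The sketch below is our own illustration of a few such operations on the same two row vectors; it is not one of the six examples above.

```maple
# A sketch of further element-wise operations on the two row vectors used above.
c1 := <1|2|3|4>:
c2 := <5|6|7|8>:
c1 *~ c2;      # entry-by-entry products
c2 ^~ 2;       # each entry squared, an alternative to map(x -> x^2, c2)
c1 /~ c2;      # entry-by-entry quotients, kept as exact fractions
```

Without the ~, an expression such as c1 * c2 is not an element-wise product, so the tilde matters here.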
Arrays and Matrices 


Arrays and matrices are structured devices used to store and manipulate data. 
An array is a specialization of a table; a matrix is a two-dimensional array. 
Both Array and Matrix are part of the linear algebra package, and require the 
with(LinearAlgebra): command prior to use. Table 1.4 presents a few examples 
of the use of the Array and Matrix commands. 


TABLE 1.4: Array and Matrix Commands 





Command                               Output 
with(LinearAlgebra):                  (load the LinearAlgebra package) 
a := Array([1, 2, 2, 3, 4]);          a := [1 2 2 3 4] 
b := Array([[1, 2], [3, 5]]);         b := [1 2] 
                                           [3 5] 
c := Vector[row]([9, 3, 1, 8, 3]);    c := [9 3 1 8 3] 
d := Matrix([[7, 3], [2, 3]]);        d := [7 3] 
                                           [2 3] 


Saving and Printing a Worksheet 


To start Maple from Windows or MacOS, launch Maple by double-clicking on 
the Maple icon. Once Maple has been started, it will automatically open an 
empty worksheet, with a flashing cursor to the right of a character prompt 
>. To save the worksheet, click on File and then click on Save As and then 
specify a name for the worksheet. Once the worksheet has been named and 
saved, click on File and then click on Save to re-save the worksheet. To open a 
previously saved worksheet, click on File and then click on Open, then specify 
the name of the worksheet. 

After re-opening a document, the document will contain the commands 
and display the results, but not have any values defined. After a command is 
executed, the result is stored in memory. If the document is closed and then 
reopened, Maple recovers the commands, but does not recall the results. 

After the completion of a document, involving commands and results, the 
document can be printed by choosing File » Print (which calls up the standard 
printing dialog). 

To quit Maple, choose File » Exit (Windows) or Maple 2019 » Quit 
(Macintosh); that is, choose Exit from the File menu (Windows) or Quit from the 
Maple 2019 menu (Macintosh). Maple prompts you to save your work when 
quitting. (In the Command-line version type “quit,” “done,” or “stop” at the 
Maple prompt. CAUTION: these are the only commands that do not require 
a trailing semicolon in the Command-line version; there is no opportunity to 
save your work when using these commands.) 


Procedures 


The proc command is very useful. The following comes directly from the 
Maple Help page in Maple 2019 and explains and provides an example for a 
procedure. 











Procedures 
Calling Sequence Evaluation Rules 
Parameters Notes 
Description Examples 
Implicit Local Variables Details 


The Operands of a Procedure 





Calling Sequence 

proc (parameterSequence) :: returnType; local localSequence; 
global globalSequence; option optionSequence; description 
descriptionSequence; uses usesSequence; statementSequence 
end proc; 


Parameters 


parameterSequence   - formal parameter declarations 
returnType          - (optional) assertion on the type of the returned value 
localSequence       - (optional) names of local variables 
globalSequence      - (optional) names of global variables used in the procedure 
optionSequence      - (optional) names of procedure options 
descriptionSequence - (optional) sequence of strings describing the procedure 
usesSequence        - (optional) names of modules or packages the procedure uses 
statementSequence   - statements comprising the body of the procedure 




Description 


• A procedure definition is a valid expression that can be assigned to a name. 
That name may then be used to refer to the procedure in order to invoke 
it in a function call. 


• The parenthesized parameterSequence, which may be empty, specifies 
the names and optionally the types and/or default values of the 
procedure’s parameters. In its simplest form, the parameterSequence is just 
a comma-separated list of symbols by which arguments may be referred 
to within the procedure. 


• More complex parameter declarations are possible in the 
parameterSequence, including the ability to declare the type that each argument must 
have, default values for each parameter, evaluation rules for arguments, 
dependencies between parameters, and a limit on the number of arguments 
that may be passed. See Procedure Parameters for more details on these 
capabilities. 


• The closing parenthesis of the parameterSequence may optionally be 
followed by ::, a returnType, and a ;. This is not a type declaration, but 
rather an assertion. If kernelopts(assertlevel) is set to 2, the type of 
the returned value is checked as the procedure returns. If the type violates 
the assertion, then an exception is raised. 


• Each of the clauses local localSequence;, global globalSequence;, 
option optionSequence;, description descriptionSequence;, and 
uses usesSequence; is optional. If present, they specify respectively, the 
local variables reserved for use by the procedure, the global variables used 
or modified by the procedure, any procedure options, a description of the 
procedure, and any modules or packages used by the procedure. These 
clauses may appear in any order. 


• Local variables that appear in the local localSequence; clause may 
optionally be followed by :: and a type. As in the case of the optional 
returnType, this is not a type declaration, but rather an assertion. If 
kernelopts(assertlevel) is set to 2, any assignment to a variable with 
a type assertion is checked before the assignment is carried out. If the 
assignment violates the assertion, then an exception is raised. 


• A global variable declaration in the global globalSequence clause cannot 
have a type specification. 


• Several options that affect a procedure’s behavior can be specified in the 
option optionSequence; clause. These are described in detail on their 
own page. 


• The description descriptionSequence; clause specifies one or more 
lines of description about the procedure. When the procedure is printed, 
this description information is also printed. Even library procedures, whose 
body is generally elided when printing, have their description (if any) 
printed. The descriptionSequence is also used when information about 
the procedure is printed by the Describe command. 


• The optional uses usesSequence; clause is equivalent to wrapping the 
statementSequence with a use statement. In other words, 


proc ... uses LinearAlgebra; ... end proc 
is equivalent to: 
proc ... use LinearAlgebra in ... end use; end proc 


• The statementSequence consists of one or more Maple language 
statements, separated by semicolons (;), implementing the algorithm of the 
procedure. 


• A procedure assigned to a name, f, is invoked by using f(argumentSeq). 
See Argument Processing for an explanation of argument passing. 


• The value of a procedure invocation is the value of the last statement 
executed, or the value specified in a return statement. 


• In both 1-D and 2-D math notation, statements entered between proc 
and end proc must be terminated with a colon (:) or semicolon (;). 
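
The return-type assertion described above can be illustrated with a short sketch. The procedure half is hypothetical and is not part of the Maple Help page; only kernelopts(assertlevel) and the :: syntax come from the description.

```maple
# A minimal sketch of a return-type assertion; the procedure half is hypothetical.
kernelopts(assertlevel = 2):        # turn on checking of type assertions

half := proc(n::integer)::integer;
    description "divide an integer by two";
    n/2;
end proc:

half(4);    # the result 2 is an integer, so the assertion holds
half(3);    # the result 3/2 is not an integer, so an assertion error is raised
```

With the assert level left at its default, the second call would simply return 3/2, which is why the help text calls this an assertion rather than a type declaration.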


Implicit Local Variables 


• For any variable used within a procedure without being explicitly 
mentioned in a local localSequence; or global globalSequence; the 
following rules are used to determine whether it is local or global: 

The variable is searched for amongst the locals and globals (explicit or 
implicit) in surrounding procedures, starting with the innermost. If the 
name is encountered as a parameter, local variable, or global variable of 
such a surrounding procedure, that is what it refers to. 

Otherwise, any variable to which an assignment is made, or which appears 
as the controlling variable in a ‘for’ loop, is automatically made local. 
Any remaining variables are considered to be global. 


• Note: Any name beginning with _Env is considered to be an environment 
variable, and is not subject to the rules above. 
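
The rules for implicit local variables can be seen in a small sketch. The procedure sumTo and its variable names are hypothetical; Maple typically prints a warning for each variable it makes local implicitly.

```maple
# A sketch of implicit local variables; sumTo and its variable names are hypothetical.
sumTo := proc(n)
    total := 0;                 # assigned but not declared: implicitly local
    for k from 1 to n do        # loop control variable: implicitly local
        total := total + k;
    end do;
    total;
end proc:

total := 100:       # a global variable with the same name
sumTo(4);           # uses the procedure's own local total
total;              # the global total is unchanged by the call
```

Declaring the variables explicitly with local total, k; avoids the warnings and makes the intent clear to the reader.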


The Operands of a Procedure 


• A Maple procedure is a valid expression like any other (e.g., integers, sums, 
inequalities, lists, etc.). As such, it has sub-parts that can be extracted 
using the op function. A procedure has eight such operands: 
op 1 is the parameterSequence, 
op 2 is the localSequence, 
op 3 is the optionSequence, 
op 4 is the remember table, 

op 5 is the descriptionSequence, 

op 6 is the globalSequence, 

op 7 is the lexical table (see note below), and 


op 8 is the returnType (if present). 


• Any of these operands will be NULL if the corresponding sub-part of the 
procedure is not present. 


• Note: The lexical table is an internal structure used to record the 
correspondence between undeclared variables and locals, globals, or parameters 
of surrounding procedures. It does not correspond to any part of the 
procedure as written. 


Evaluation Rules 


• Procedures have special evaluation rules (like tables) so that if the name 
f has been assigned a procedure, then: 
f evaluates to just the name f, 
eval(f) yields the actual procedure, and 
op(eval(f)) yields the sequence of eight operands mentioned above (any 
or all of which may be NULL). 


• Within a procedure, during the execution of its statementSequence, 
local variables have single level evaluation. This means that using a 
variable in an expression will yield the current value of that variable, rather 
than first evaluating that value. This is in contrast to how variables are 
evaluated outside of a procedure, but is similar to how variables work in 
other programming languages. 


Notes 


• Remember tables (option remember) should not be used for procedures 
that are intended to accept mutable objects (e.g., rtables or tables) as 
input, because Maple does not detect that such an object has changed 
when retrieving values from remember tables. 


Examples 


> lc := proc( s, u, t, v ) 
      description "form a linear combination of the arguments"; 
      s * u + t * v; 
  end proc; 
lc := proc(s, u, t, v) 
    description "form a linear combination of the arguments"; 
    s*u + t*v 
end proc 

> lc(Pi, x, -I, y); 
π x - I y 
> Describe(lc); 
# form a linear combination of the arguments 
lc( s, u, t, v ) 
> lc; 
lc 
> eval(lc); 
proc(s, u, t, v) 
    description "form a linear combination of the arguments"; 
    s*u + t*v 
end proc 


> op(1, eval(lc)); 
s, u, t, v 
> addList := proc(a::list, b::integer)::integer; 
      local x, i, s; 
      description "add a list of numbers and multiply by a constant"; 
      x := b; 
      s := 0; 
      for i in a do 
          s := s + a[i]; 
      end do; 
      s := s * x; 
  end proc; 
addList := proc(a::list, b::integer)::integer; 
    local x, i, s; 
    description "add a list of numbers and multiply by a constant"; 
    x := b; s := 0; for i in a do s := s + a[i] end do; s := s*x 
end proc 


> sumList := addList([1, 2, 3, 4, 5], 2); 
sumList := 30 





Details 
For details on defining, modifying, and handling parameters, see Procedure 
Parameters. 


See Also 

_nresults, assertions, Function Calls, kernelopts, Last-Name Evaluation, Pro- 
cedure Options, ProcessOptions, procname, Reading and Saving Procedures, 
remember, return, separator, Special Evaluation Rules, use 
We add an example procedure for using Newton’s Method to find the roots 
of a differentiable function. First, recall the iterating formula for Newton’s 
root-finding algorithm: 


$$x_{\text{new}} = x_{\text{old}} - \frac{f(x_{\text{old}})}{f'(x_{\text{old}})}$$ 


We iterate the formula, finding new values of $x_{\text{new}}$, until the absolute 
difference $|x_{\text{new}} - x_{\text{old}}| <$ tolerance or $|f(x_{\text{new}})| <$ tolerance. 


The Maple procedure: Newton’s Method for Finding Roots of a polyno- 
mial. 
Assumptions: 


1. The function must be differentiable. 


2. You have a fair guess at a starting point x0 for the method. We suggest 
plotting the function and estimating a value near the desired root. 


Algorithm: 


1. Pick a tolerance, tol (tol is small), and a maximum number of iterations, 
MaxN, allowed. 


2. Pick an initial value xold. 
3. Iterate xnew := xold - f(xold) / f’(xold). 


4. Stop when either |xnew - xold| < tol or |f(xnew)| approximately equals 
0. 


Maple Procedure: 


> Newton := proc(f, x0, tol, MaxN) 
      local df, xold, xnew, i; 
      df := D(f); 
      xold := x0; 
      print(xold); 
      for i from 1 to MaxN do 
          xnew := xold - f(xold)/df(xold); 
          if abs(f(xnew)) < tol or abs(xnew - xold) < tol then 
              return(xnew); 
          end if; 
          xold := xnew; 
          print(xold); 
      end do; 
      return(cat(MaxN, " iterate "), xold); 
  end proc: 
Let’s consider an example. 
> f := x -> (x - 5)^2 - 3; 
f := x ↦ (x - 5)^2 - 3 
We plot the function to be able to estimate the roots for the procedure. 


> plot(f(x), x = -1..10, thickness = 3, color = black); 











> Newton(f, 2.0, 10^(-9), 30); 
2.0 
3.000000000 
3.250000000 
3.267857143 
3.267949190 
3.267949192 
3.267949192 


> f(3.267949192); 
1. × 10^(-9) 
> Newton(f, 7.0, 10^(-9), 30); 
7.0 
6.750000000 
6.732142857 
6.732050810 
6.732050808 
6.732050808 
> f(6.732050808); 
1. × 10^(-9) 
We found approximations to the two roots as x = 3.267949192 and x = 
6.732050808. 
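
As an independent check on the procedure's output, the roots can also be computed with Maple's built-in solvers. This is only a sketch of one way to verify the result, not part of the Newton procedure itself.

```maple
# A sketch of an independent check using Maple's built-in solvers.
f := x -> (x - 5)^2 - 3:
fsolve(f(x) = 0, x);          # for a polynomial, fsolve returns all real roots
evalf(solve(f(x) = 0, x));    # exact roots 5 - sqrt(3) and 5 + sqrt(3), as decimals
```

Agreement between these values and the Newton iterates, to the displayed digits, suggests the procedure converged to the correct roots.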


We will often create procs in both volumes of this text to solve problems 
using specialized techniques. 


A Quick Review of Key Commands 
Assignments and Basic Mathematics 

For example, let’s compute $(1.2^{4.1} + 4.3(9.8))/34$. 
> (1.2^4.1 + 4.3*(9.8))/34; 





1.301522146 


All Maple statements are entered after the > prompt. Again note that 
Maple commands end with a semicolon; in the document interface, semicolons 
are optional. Maple output, printed in blue, is centered in the page. 

To assign a label or name to a number or an expression, we use :=. For 
example, let’s use two Maple statements to assign a the value 11.5 and b the 
value 9. 


> a := 11.5; 
a := 11.5 
> b := 9; 
b := 9 


We can enter multiple commands on the same line or in the same input 
cell. If we separated the two commands with a colon instead of a semicolon, 
only the second command would be displayed in the output even though both 
commands were executed. A colon suppresses output. 


> a := 11.5: b := 9; 
b := 9 


Once the assignments have been made we can perform arithmetic 
operations. For example, let’s compute $a^2 + b^3$, $a^2 \cdot b^3$, and $\sqrt{a^2 + b^3}$. 


> a^2 + b^3; 
  a^2 * b^3; 
  sqrt(a^2 + b^3); 
861.25 
96410.25 
29.34706118 


A very useful command is evalf. This command produces the decimal 
equivalent of a given expression. Its mnemonic is evaluate as floating point 
(decimal). 
> c := evalf(a^2/b^3); 
c := 0.1814128944 


For expressions, we assign y using the same assignment operator :=. 





> y := x^2 + 21.6*x - 1; 
y := x^2 + 21.6*x - 1 
To evaluate this type of expression for specific values of x, we use the subs 
command (substitution) or the eval command (evaluate). For example, we 


want to substitute 3 for x. 

> subs(x = 3, y); 
72.8 
> eval(y, x = 3); 
72.8 


To use functional notation, such as f(3), we must start with a different form 
of assignment. To create a function assignment f we use the arrow operator as 
follows. (Type ‘-’, then ‘>’ for the arrow operator; the image changes to ‘→’.) 


> f := x -> x^2 + 21.6*x - 1; 
f := x ↦ x^2 + 21.6*x - 1 


> f(3); 
72.8 
We can even substitute expressions such as (x + h) for x: 
> f(x + h); 
(x + h)^2 + 21.6*x + 21.6*h - 1 


The two forms, expressions (objects) and functions (operations), are very 
important in both programming and plotting as we shall see later. Creating 
a function from an expression is easy using the unapply command. 


> y := x^3 + 3*cos(x) - 4; 
y := x^3 + 3 cos(x) - 4 

> f := unapply(y, x); 
f := x ↦ x^3 + 3 cos(x) - 4 


> f(x); 
x^3 + 3 cos(x) - 4 


> f(2); 
4 + 3 cos(2) 


Maple can easily handle functions of more than one variable. For example, 
consider a surface area defined by $\pi(x^2 y^3 + 3)$. Suppose we want to evaluate 
the surface area at the point (2,5). 
> s := (x, y) -> Pi*(x^2*y^3 + 3); 
s := (x, y) ↦ π (x^2 y^3 + 3) 
> s(2, 5); 
503 π 
> evalf(%); 
1580.221105 


The expression evalf(%) contains the symbol ‘%’ that means insert the 
result of the last expression evaluated. (This may not be the result just above 
in your Maple document if you’ve re-executed another statement.) 


Algebra and Calculus 
Let’s return to our expression $y = x^2 + 21.6x - 1$. 


> y := x^2 + 21.6*x - 1; 
y := x^2 + 21.6*x - 1 


Let’s factor y. We can use the factor command or the solve command. The 
solve command has more utility. 


> factor(y); 
(x + 21.64619749) (x - 0.04619749036) 
> solve(y = 0); 
0.04619749036, -21.64619749 


Let’s consider the function $q(x) = ax^2 + bx + c$. We use the solve command 
and obtain the result: 


> restart; 


> q := x -> a*x^2 + b*x + c; 
q := x ↦ a*x^2 + b*x + c 
> solve(q(x) = 0, x); 
(-b + sqrt(-4*a*c + b^2))/(2*a), -(b + sqrt(-4*a*c + b^2))/(2*a) 

Note the results are the quadratic formula. We also used the command 
restart. A restart forgets all previous assignments to f, a, b, c, and anything 
else we defined. 

In calculus, we can differentiate and integrate in one and many variables. 
The commands for differentiation and integration are: 
diff & Diff: differentiation or partial differentiation and ‘indefinite’ 
differentiation 


int & Int: definite and indefinite integration 


For example, let’s differentiate an expression $y = 2x^2 + 24.1x - 1$, and then 
find the area under the curve in general and from x = 1 to x = 4. 
> y := 2*x^2 + 24.1*x - 1; 
y := 2*x^2 + 24.1*x - 1 


> Diff(y, x); 
d/dx (2*x^2 + 24.1*x - 1) 
> diff(y, x); 
4*x + 24.1 
> Int(y, x); 
∫ (2*x^2 + 24.1*x - 1) dx 
> int(y, x); 
0.6666666667*x^3 + 12.05000000*x^2 - x 
> Int(y, x = 1..4); 
∫_1^4 (2*x^2 + 24.1*x - 1) dx 
> int(y, x = 1..4); 
219.7500000 


Note the difference between diff and Diff. The capital letters indicate the 
“inert form” of the command. 

Often, we want to find critical points of a function, where the first derivative 
is zero (y’ = 0). We can use the solve command as follows: 


> solve(diff(y, x) = 0, x); 








—6.025000000 
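
Once a critical point is found, the second derivative indicates whether it is a minimum or a maximum. The following sketch continues the example; the name xc is hypothetical and introduced only for this illustration.

```maple
# A sketch of classifying the critical point with the second derivative.
# The name xc is hypothetical; y is the expression used in the example above.
y := 2*x^2 + 24.1*x - 1:
xc := solve(diff(y, x) = 0, x);     # the critical point found above
diff(y, x, x);                      # second derivative; it is positive, so xc is a minimum
eval(y, x = xc);                    # value of y at the critical point
```

Here the second derivative is the constant 4, so the parabola opens upward and the critical point is its lowest point.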


Plotting and Graphs 


Maple has an extremely detailed and developed plot command, which provides 
graphs in both two and three dimensions. For modeling purposes, 2-D plots 
will typically be used. We suggest loading the plots package, via with(plots), 
prior to plotting. The syntax for plot is plot(y, hr, vr, options), where y is 
the expression to be plotted, hr is the horizontal range and vr is the vertical 
range. Additionally, many other options can be added after the vertical range 
to control a variety of items. Table 1.5 presents a short list of options (default 
values are in the third column). See the Maple Help topic “plot/options” for 
the full list. 


TABLE 1.5: Plot Command Options 





Option        Description                            Default 
scaling =     constrained or unconstrained           unconstrained 
style =       point, line, patch or patchnogrid      line 
title =       “a title”                              no title 
thickness =   0, 1, 2, or 3                          1 
axes =        framed, boxed, normal or none          normal 
view =        [xmin..xmax, ymin..ymax]               entire graph 


Continuous Plots 


The command plot(y) will generate a 2-D plot of y with a default horizontal 
range of -10..10 and a vertical range that shows the curve. No other command 
information is required; however, the axes ranges and options may be specified 
to generate a specific plot. The horizontal range can be specified as a finite range 
or an infinite range. By constraining the scale, equal units occur in both the 
x and y directions. However, a plot is generally easier to see when the scale 
is unconstrained, although it would be distorted (e.g., a circle would appear 
as an ellipse). Maple automatically scales the axes to spread the data over as 
large a space as possible, but this procedure does not imply that the area of 
interest will be plotted most effectively. As a result, the view option must be 
employed to ensure that the correct portion of the plot is best displayed. To 
demonstrate the plot command with a variety of options, a few examples are 
provided of sin(x) below, the first with the default ranges, the second with an 
infinite range. 
> plot(sin(x), color = black, thickness = 3); 


[Plot of sin(x) over the default horizontal range] 


> plot(sin(x), x = 0..infinity, color = black, thickness = 3); 
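
Several of the options from Table 1.5 can be combined in a single call. The sketch below is our own illustration; the ranges, title, and option values are arbitrary choices rather than settings taken from the examples above.

```maple
# A sketch combining several plot options; the ranges and title are illustrative choices.
plot(sin(x), x = 0..4*Pi,
     view = [0..4*Pi, -1.5..1.5],      # display only this window of the graph
     scaling = constrained,            # equal units on both axes
     axes = boxed,
     title = "sin(x) on [0, 4*Pi]",
     color = black, thickness = 2);
```

Changing scaling to unconstrained stretches the curve to fill the plot region, which shows the trade-off between readability and distortion described above.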











Scatterplots 


The previous plots demonstrate functions with continuous x-values; either 
defaulted to —10..10 or selected by the user. However, Maple can also plot 
discrete sets of data. Using the data provided in Table 1.6, we present an 
example of plotting the ages of five people versus their respective weights. 
TABLE 1.6: Age-Weight Data 


Age (years)  | 1   5   13  17   24 
Weight (lbs) | 15  40  90  160  180 





> age := [1, 5, 13, 17, 24]: 
  wt := [15, 40, 90, 160, 180]: 
> agewt := {seq([age[k], wt[k]], k = 1..5)}; 
agewt := {[1, 15], [5, 40], [13, 90], [17, 160], [24, 180]} 
> plot(agewt, style = point, symbol = diamond, symbolsize = 14); 


[Scatterplot of weight (lbs) versus age (years)] 





We could also have used 


> with(plots): 
  pointplot(agewt, symbol = diamond, symbolsize = 14); 





Multiple Plots 


It is also an option to plot multiple functions on one set of axes. One method 
requires that both functions have the same domain. The second method 
uses the display command which requires the plots package be loaded via 
with(plots):. This method does not restrict the domain, as presented below. 
> myOptions := (scaling = constrained, axes = framed, thickness = 2): 
  plot([sin(x), x^2], x = 0..5, myOptions); 








> with(plots): 
> curve := plot(30*sin(x), x = 0..30): 
> points := pointplot(agewt, symbol = diamond, symbolsize = 14): 
> display(curve, points); 

[Plot of the curve 30 sin(x) together with the age-weight data points] 
An intriguing method of adding curves to an existing graph is to select the 
output (blue font on your screen) expression desired, then “drag and drop” 
that selection onto an existing plot image. Try it! 





1.5 Maple Training 


A Maple training video of the basic features is provided at the website: 
https://www.maplesoft.com/support/training/quickstart.aspx. 

The website has a self-contained video tutorial “Maple Quick Start.” Maple’s 

basic features covered include: 


• Numeric Computations 
• Symbolic Computations 
• Programming Basic Maple Procedures 
• Visualization 
• Calculus 
• Linear Algebra 
• Student Packages 
• Other Maple Packages 


This training serves as a basic refresher of Maple commands. The website also 
has a “QuickStart” PDF reference. 








1.6 Maple Applications Center 


The Maple Applications Center found at 

https://www.maplesoft.com/applications/ 
contains a large collection of worksheets available to students and practitioners 
alike. The authors have several worksheets available on the site. 





Exercises 


Perform the following operations in Maple: 
1. /2.35(45) 

2. 11.3? + 5.1? 

3. 21.6 

4. Let a = 8, b = 7, then compute (a? — b°). 
5. 2(11.5) + 6.22(0.7) 


Enter the following functions. Obtain a graph. Find roots and solve for the 
intercepts. 


6. f(z) =r? +3243 
7. f(z) =2? —32-1 
8. f(x) = —0.1213x+ + 3.46223 — 29.222 + 64.682 + 97.69 
9. g(x) = x? — 2x? — 5x +6 
10. s(x) = 2z? — 3z? — 11x +7 
Enter the following data sets into Maple and obtain a scatterplot for each: 


11. x | 1    3  8     10 
    y | 0.7  5  15.2  36 





12. t|7 14 21 28 35 42 
P|8 41 133 250 280 297 





13. x | 29 48 72.7 92 118 140 165 199 
y | 0.49 0.82 1.23 1.54 1.97 2.34 2.74 3.30 





Perform the required function in Maple. 


14. $\frac{d}{dx}\left(1.104x - 0.542x^2\right)$ 


15. $\int \left(1.104x - 0.542x^2\right)\,dx$ 


16. $\int_1^5 \left(1.104x - 0.542x^2\right)\,dx$ 
Discrete Dynamical Models 











Objectives: 

(1) Define and build a discrete dynamical system for a real problem. 
(2) Iterate a solution. 

(3) Graph a solution. 

(4) Understand equilibrium values and stability. 


(5) Solve both linear and nonlinear systems of discrete dynamical 
systems. 


(6) Solve systems of discrete dynamical systems. 





2.1 Introduction 


Consider a disease that is spreading throughout the United States such as 
a new influenza or flu. The U.S. Centers for Disease Control and Preven- 
tion (CDC) is interested in knowing and experimenting with a model for this 
new disease prior to it actually becoming a “real” epidemic.¹ Let us consider 
the population being divided into three categories: susceptible (able to catch 
this new flu), infected (currently has the flu and is contagious), and removed 
(already had the flu and will not get it again, or has died from the flu). We 
make the following assumptions for our model: 


• No one enters or leaves the community. 

• There is no contact outside the community. 

• Each person is either susceptible S, infected I, or removed R. 
• Initially, every person is either an S or I. 


• Once someone gets the flu this year, they cannot get it again (‘developed 
immunity’). 





¹ The CDC website “FluView” monitors the current situation in the U.S. 
• The average length of the disease is 2 weeks over which the person is 
deemed infected and can spread the disease. 


• The time step for the model will be one week. 


Can we build a model to determine useful information to the CDC? 
We will revisit this model later in the chapter. 








2.2 Modeling with Dynamical Systems² 


Buying a new car is a major purchase. Once you have looked at the makes 
and models to determine what type of car you like, it’s time to consider the 
“costs” and finance packages available. Payments are made typically at the 
end of each month. The amount owed is predetermined depending on the total 
price of the car, fees and charges, and the interest rate of a loan. This process 
can be modeled as a dynamical system. 

Another name for a discrete dynamical system (DDS) is a “difference equa- 
tion.” We will use DDS extensively in this chapter. Let’s begin with basic 
definitions: 

A sequence is a function whose domain is the set of all nonnegative inte- 
gers and whose range is a subset of the real numbers. A dynamical system is 
a relationship among the terms in a sequence; a DDS describes the evolution 
of an item over time. A numerical solution is a table of values satisfying 
the dynamical system. 

Let’s start with a brief review of the concepts of ‘relation’ and ‘function’ 
since these are foundational in the definition above. A relation is a set of 
ordered pairs, often written as (input, output). As an ordered pair, a relation 
can indicate how one variable depends on another. The domain of a relation 
is the set of all first coordinates of the ordered pairs, the range is the set 
of all second coordinates of the ordered pairs. A function is a special type of 
relation. A function is a relation for which each element in the domain has 
exactly one related element in the range. The domain of a function is the set 
of all of the independent variables (allowed inputs) and the range is the set of 
all possible dependent values (outputs). 

The concepts of function, domain, and range can be seen clearly in a 
dynamical system model. Let’s define a recurrence relation, a relation where 
the next term depends on the previous values, as an equation of the form 


$$a(n+1) = f\bigl(a(n), a(n-1), \ldots, a(0), n\bigr) \quad \text{or} \quad a_{n+1} = f(a_n, a_{n-1}, \ldots, a_0, n).$$ 


Using subscripts for arguments makes the notation much more compact and 
easy to read. 





² Modified from USMA Math 103 Study Guide. 
Recurrence relations, also called recursive formulas, occur in many 
branches of applied mathematics. A discrete dynamical system (DDS or 
difference equation) is a sequence defined by a recurrence relation. A DDS is a 
“changing system,” where the change of the system at each discrete iteration 
depends on its previous states. Suppose we have a function y = f(x). A first- 
order discrete dynamical system is given by the sequence of numbers a(n) for 
n = 0,1,... such that each number after the first is related to the previous 
number by 

a(n + 1) = f(a(n)).

(Many books and articles refer to the relationship a(n + 1) - a(n) = g(a(n))
as a first-order difference equation.)

The order of a dynamical system is the difference between the largest and
the smallest arguments of the sequence appearing in the formula.
For example, A(n + 2) = 0.5A(n + 1) + 2A(n) is a second-order DDS because
n + 2, the largest argument, minus n, the smallest, gives (n + 2) - n = 2.
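
As a small illustration of order, the second-order system above can be iterated in Maple once two starting values are supplied. This is only a sketch: the starting values A(0) = 1 and A(1) = 1 are assumptions chosen for illustration and are not part of the text's example.

```maple
# Iterate the second-order DDS A(n+2) = 0.5*A(n+1) + 2*A(n).
# Assumed starting values (illustrative only): A(0) = 1 and A(1) = 1.
A := proc(n::nonnegint)
    option remember;
    if n < 2 then
        return 1;
    end if;
    # Shifted form of the rule: A(n) = 0.5*A(n-1) + 2*A(n-2).
    return 0.5*A(n - 1) + 2*A(n - 2);
end proc:

# Pairs [n, A(n)] for n = 0..6.
seq([n, A(n)], n = 0..6);
```

Two starting values are needed because each new term reaches back two steps; with only A(0) given, A(2) could not be computed.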


Example 2.1. A Simple First-Order Discrete Dynamical System. 

Define A(n) to be the amount of antibiotic in a patient’s bloodstream after
n time periods. The domain is the set of nonnegative integers 0, 1, 2, ...,
representing the time periods, that will be the inputs to the function. Since
the domain is discrete, we have a DDS. The range is the set of values of A(n)
determined for each value of the domain. Thus, A(n) also represents the
dependent variable. For each input value n from the domain, the result is one
and only one amount A(n); thus A is a function.


There are three components to a dynamical system:

• a formula for the sequence representing A(n),
• the time period n is well defined, and
• at least one starting value \((n_0, A(n_0))\). (The number of required
  starting values corresponds to the order of the DDS.)


The starting value set is called the initial condition(s). In the example 
above, if we start with no antibiotic in our system then A(0) = 0 mg is our 
initial condition. However, if we started after we took an initial 200 mg tablet, 
then A(0) = 200 mg would be our initial condition. An example of a discrete 
dynamical system with its initial condition would be: 


A(n + 1) = 0.5A(n) 
A(0) = 200 
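
A minimal Maple sketch of this system, using only rsolve and seq. The coefficient is entered as the exact fraction 1/2 (an assumption made here so the closed form is easy to read); the name sol is introduced only for this illustration.

```maple
# Closed form for A(n+1) = 0.5*A(n), A(0) = 200.
sol := rsolve({A(n + 1) = (1/2)*A(n), A(0) = 200}, A(n));

# First few values of the solution: pairs [k, A(k)] for k = 0..5.
seq([k, eval(sol, n = k)], k = 0..5);
```

Each value is half of the one before it, which is exactly what the recurrence says.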


We are mainly concerned with several aspects of a DDS: Does the DDS
have a stable equilibrium? What is the value of the system after period n for
a specified n? What is the long-term behavior of the DDS?

Next, let’s look at Maple commands that can assist us with investigating
a DDS.

Maple Commands for Discrete Dynamical Systems


We will use familiar commands and libraries from Maple, such as plots, and
we will add new commands to our Maple toolbox.





rsolve - recurrence equation solver

Calling Sequence
    rsolve(eqns, fcns)
    rsolve(eqns, fcns, 'genfunc'(z))
    rsolve(eqns, fcns, 'makeproc')
    rsolve(eqns, fcns, 'series')
Parameters
    eqns - single equation or a set of equations
    fcns - function name or set of function names
    z    - name, the generating function variable


seq - create a sequence

Calling Sequence
    seq(f, m..n)
    seq(f, i = m..n)
    seq(f, i = m..n, step)
    seq(f, i = x)
Parameters
    f    - any expression
    i    - name
    m, n - numerical values
    x    - expression
    step - (optional) numerical value











Look at Maple’s help pages for rsolve and seq for detailed information and to 
see many examples. 

Several of the models that we will solve have closed-form solutions; we can 
use rsolve to obtain a formula, then we can use seq to obtain numerical values 
of the solution. 

Many dynamical systems do not have closed-form analytical solutions, so 
we cannot use rsolve and seq to obtain solutions. When this occurs, we will 
write a small program using proc to obtain the numerical solutions through 
iteration. To graph the solution of the dynamical systems, we will use plot 
commands to see sequential data pairs (k, A(k)). We will illustrate Maple use 
in our examples in the next section. 
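
The iteration approach can be sketched generically. In the sketch below, the helper name Iterate and the sample rule a(n+1) = a(n) + 0.5·a(n)·(1 - a(n)) with a(0) = 0.1 are hypothetical, chosen only to show the pattern; they do not come from the text.

```maple
# Iterate a first-order DDS a(n+1) = f(a(n)) and return the pairs [k, a(k)].
# Iterate is a hypothetical helper name, not a built-in Maple command.
Iterate := proc(f, a0, N::nonnegint)
    local vals, k;
    vals := Array(0..N);
    vals[0] := a0;
    for k from 1 to N do
        vals[k] := f(vals[k - 1]);
    end do;
    return [seq([k, vals[k]], k = 0..N)];
end proc:

# Assumed sample rule (illustrative only), iterated for 20 steps and plotted.
f := x -> x + 0.5*x*(1 - x):
pts := Iterate(f, 0.1, 20):
plot(pts, style = point, symbol = solidcircle, symbolsize = 14);
```

Because the procedure only ever applies f to the previous value, it works the same way whether or not a closed-form solution exists.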
2.3 Linear Systems


We are interested in modeling discrete change. Modeling with discrete dynam- 
ical systems employs a method to explain certain discrete behaviors or make 
long-term predictions. A powerful paradigm that we use to model with discrete 
dynamical systems is: 


future value = present value + change 


The dynamical systems that we will study with this paradigm will differ 
markedly in appearance and composition, but we will be able to solve a large 
class of these “seemingly” different dynamical systems with similar methods. 
In this chapter, we will use iteration and graphical methods to answer ques- 
tions about discrete dynamical systems. 

We will use flow diagrams to help us see how the dependent variable 
changes. Flow diagrams help to see the paradigm and to put the problem 
into mathematical terms. Let’s consider financing a new Ford Mustang. The 
total cost is $25,000. We can put down $2,000, so we need to finance $23,000. 
The dealership offers us 2% APR financing over 72 months. Consider the flow 
diagram for financing the car below in Figure 2.1 that depicts this situation. 


[Flow diagram: Interest → (Amount Owed) → Payment]


FIGURE 2.1: Car Financing Flow Diagram


We’ll use the flow diagram to help build the discrete dynamical model. 
Let A(n) = the amount owed after n months. Notice that the arrow pointing
into the Amount Owed oval is the interest, which increases the unpaid balance.
The arrow pointing out of the oval is the monthly payment, which decreases the
debt. If

A(n + 1) = amount owed next month (future)
A(n) = amount currently owed (present)
i = monthly interest rate
P = monthly payment

then our paradigm future = present + change gives

\[ A(n+1) = A(n) + \bigl(i \cdot A(n) - P\bigr). \]
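
The car-loan model can be explored with rsolve. The sketch below assumes that the monthly rate is the 2% APR divided by 12 and that the payment P is set so the balance is exactly zero after 72 months; the text does not state the payment, so the names sol and payment are introduced here only for illustration.

```maple
# Car loan: A(n+1) = A(n) + (i*A(n) - P), with A(0) = 23000.
# Assumptions for this sketch: monthly rate i = (2/100)/12, and the payment P
# is chosen so that the balance is exactly zero after 72 months.
i := (2/100)/12:
sol := rsolve({A(n + 1) = (1 + i)*A(n) - P, A(0) = 23000}, A(n)):

# Solve A(72) = 0 for the monthly payment and show it as a decimal.
payment := evalf(solve(eval(sol, n = 72) = 0, P));
```

Under these assumptions the payment comes out to roughly $339 per month. Leaving P symbolic in rsolve keeps the balance formula general, so other loan lengths can be tried by changing only the 72.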


We will model dynamical systems that have constant coefficients. For
example, a third-order discrete dynamical system with constant coefficients
that is homogeneous may be written in the form

\[ a(n+3) = b_2\,a(n+2) + b_1\,a(n+1) + b_0\,a(n), \]

where \(b_0\), \(b_1\), and \(b_2\) are arbitrary constants. If we added a term
not involving any of the a(n) to the right side, this DDS would be
nonhomogeneous.

The “Tower of Hanoi” puzzle, which we will illustrate shortly, involves
moving a tower of disks from one pole to another. The number of discrete moves
H(n) that it takes to move n disks depends on the number of discrete moves
it took to move n - 1 disks. For the drug dosage model of Example 2.1, the
model shows the amount of drug in the bloodstream after n hours depends on
the amount of drug in the bloodstream after n - 1 hours. For financial
matters, such as our Mustang purchase, the amount we still owe after n months
depends upon the amount we owed after n - 1 months. We will also find this
process based on transitioning states useful when we discuss Markov chains
as an example of a discrete dynamical system.

Recall that a first-order DDS is a system where a(n), the state of the
system after n iterations, is related only to the previous iteration
a(n - 1) by the relation

\[ a(n) = f\bigl(a(n-1)\bigr); \]

i.e., a(n) is a function of only a(n - 1) for n = 2, 3, .... For example,
a(n) = 2a(n - 1) is a first-order DDS. If the function f is just a constant
multiple of its argument, a constant coefficient, we will say that the discrete
dynamical system is linear. If the function f involves powers (like
\(a(n)^2\)), or a functional relationship (like \(a(n)/a(n-1)\) or
\(\sin(a(n-1))\)), we will say that the discrete dynamical system is nonlinear.


Example 2.2. Iteration with the Tower of Hanoi. 

The Tower of Hanoi puzzle was invented in 1883 by the French mathematician
Édouard Lucas (1842-1891) under the pseudonym “Professor N. Claus (of
Siam) from the Mandarin of the College of Li-Sou-Stian.”³ The puzzle consists
of a board with three upright pegs and disks (now usually 6 to 10) with
successively smaller outside diameters. The disks begin on the first peg with
the largest disk on the bottom, topped by the next largest disk, and so on up
to the smallest disk on top forming a tower or pyramid as we see in Figure 2.2.


The object of the game is to transfer the disks, one at a time, using the
smallest possible number of moves, to the third peg to form an identical
pyramid. During each transfer step, a larger disk may not be placed on top of a
smaller disk; this is why the second and third pegs are needed. Lucas’ original
rules were the following.





³ The puzzle is also called the “Tower of Brahma” or the “Tower of Lucas.”
There are several intriguing legends about the game. See Paul Stockmeyer’s webpage
http://www.cs.wm.edu/~pkstoc/toh.html for the original game box and instructions.

FIGURE 2.2: Tower of Hanoi Puzzle⁴


The game consists of moving this, by threading the disks on another 
peg, and by moving only one disk at a time, obeying the following rules: 


I. After each move, the disks will all be stacked on one, two, or three 
pegs, in decreasing order from the base to the top. 


II. The top disk may be lifted from one of the three stacks of disks, 
and placed on a peg that is empty. 


III. The top disk may be lifted from one of the three stacks and placed 
on top of another stack, provided that the top disk on that stack 
is larger. 


Lucas’ original La Tour d’Hanoï game had eight disks. The puzzle was a
model to illustrate moving the legendary “Sacred Tower of Brahma,” which
actually had “sixty-four levels in fine gold, trimmed with diamonds.” This
tower was attended by monks of the Temple of Benares who moved one disk
a minute according to the long-established ritual that no larger disk could
be placed on a smaller disk. The monks believed, so Lucas’s story went, that
as soon as all sixty-four disks were transferred, the earth would collapse in a
cloud of dust.

This legend is somewhat alarming, so let’s use induction to find a formula 
for the number of moves required to transfer n disks from the first peg to the 
third peg. Suppose that we have n + 1 disks. Then we can move n smaller 
disks from the first peg to the second peg in a(n) moves, the number of moves 
for n disks. Then we move the largest disk from the first peg to the third peg. 
Finally we move the n smaller disks from the second peg to the third peg in 
a(n) moves. Therefore, a(n+1) = (a(n)+1)+a(n) = 2a(n)+1. So the recursion 
relation that gives the number of moves required is a(n + 1) = 2a(n) + 1; the 
initial value is a(1) = 1, meaning that we start with the case where there 
is only one disk to move. Figure 2.3 shows a plausible flow diagram for the 
Tower of Hanoi. 
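
The counting argument can be checked by actually generating the moves. HanoiMoves below is a hypothetical helper written for this sketch; it follows the three steps of the argument (move n - 1 disks aside, move the largest disk, move the n - 1 disks back on top) and returns the list of moves as [from peg, to peg] pairs.

```maple
# Return the list of moves that transfers n disks from peg src to peg dst,
# using peg via as the spare. HanoiMoves is a hypothetical helper name.
HanoiMoves := proc(n::nonnegint, src, dst, via)
    if n = 0 then
        return [];
    end if;
    return [op(HanoiMoves(n - 1, src, via, dst)),   # n-1 smaller disks to the spare peg
            [src, dst],                             # largest disk to the target peg
            op(HanoiMoves(n - 1, via, dst, src))];  # n-1 smaller disks onto the largest
end proc:

# The moves for three disks, and the move counts for 1 to 6 disks.
HanoiMoves(3, 1, 3, 2);
seq(nops(HanoiMoves(k, 1, 3, 2)), k = 1..6);
```

The counts 1, 3, 7, 15, 31, 63 satisfy a(n + 1) = 2a(n) + 1, since each list is two copies of the previous list plus one extra move.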





⁴ Photo source: Bjarmason (2005). Creative Commons License.

[Flow diagram with labels: 1, Moves after n disks, 2a(n)]


FIGURE 2.3: Tower of Hanoi Flow Diagram


There is a formula for the number of moves required to move n disks 
from the first peg to the third peg. We will consider methods of solving this 
recursion relation to find this formula. The first method is iteration. Later we 
will learn a better method by explicitly solving the DDS. 

Let’s build Table 2.1 by iterating the recursion relation 


a(n + 1) = 2a(n) +1 
a(1) =1. 


TABLE 2.1: Tower of Hanoi Recursion

Number of Disks   Recursion Relationship            Number of Moves
1                 a(1) = 1                          1
2                 a(2) = 2a(1) + 1 = 2(1) + 1       3
3                 a(3) = 2a(2) + 1 = 2(3) + 1       7
4                 a(4) = 2a(3) + 1 = 2(7) + 1       15
5                 a(5) = 2a(4) + 1 = 2(15) + 1      31
6                 a(6) = 2a(5) + 1 = 2(31) + 1      63
7                 a(7) = 2a(6) + 1 = 2(63) + 1      127
n                 a(n) = 2a(n - 1) + 1              2a(n - 1) + 1


Using iteration, we repeatedly calculate the recursion relation, each time
for the next value of n. To find out how many minutes it would take to move
sixty-four disks, we would have to iterate the given recursion relation 63
times, starting from a(1). We’ll use Maple to help us with the computations
for Table 2.2.

TABLE 2.2: Tower of Hanoi with Sixty-Four Disks

Disks   Moves Required
1       1
5       31
10      1023
15      32767
20      1048575
25      33554431
30      1073741823
35      34359738367
40      1099511627775
45      35184372088831
50      1125899906842623
55      36028797018963967
60      1152921504606846975
64      18446744073709551615
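
One way to produce the rows of Table 2.2 is a plain loop that applies the recursion and keeps only the rows of interest. This is a sketch, not necessarily how the table was generated; the variable names a and rows are introduced only for illustration.

```maple
# Iterate a(n+1) = 2*a(n) + 1 from a(1) = 1, keeping n = 1, every 5th n, and n = 64.
a := 1:                          # a(1) = 1
rows := [[1, a]]:
for n from 2 to 64 do
    a := 2*a + 1;
    if n mod 5 = 0 or n = 64 then
        rows := [op(rows), [n, a]];
    end if;
end do:

# Print the kept rows as "disks  moves".
for r in rows do
    printf("%2d   %d\n", r[1], r[2]);
end do:
```

Maple integers have arbitrary precision, so the 20-digit value for 64 disks is computed exactly.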


The table above shows that 18446744073709551615 ≈ \(1.84467 \times 10^{19}\)
minutes are needed. Let’s convert to more meaningful units. There are 1440
minutes in a day and 525,600 minutes in a 365-day year. At one move per minute,
a(64) is equivalent to about \(3.5097 \times 10^{13}\) years. Imagine how long
that amount of time really is. (The estimated age of the universe is
\(1.37 \times 10^{10}\) years.)
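
The unit conversion is a one-line computation in Maple. The sketch below assumes a 365-day year, matching the 525,600 minutes quoted above; the variable names are illustrative.

```maple
moves := 2^64 - 1:                  # a(64), at one move per minute
minutes_per_year := 60*24*365:      # 525600, assuming a 365-day year
years := evalf(moves/minutes_per_year);

# Ratio to the estimated age of the universe quoted in the text.
evalf(years/(1.37*10^10));
```

The ratio shows the monks would need a few thousand times the estimated age of the universe.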

The power of discrete dynamical systems is that they can always be solved 
by iteration. Between the table of iteration values and a graph of those values, 
we are able to analyze difficult modeling problems. 

In Maple, to obtain the iteration values, we can either write a procedure
with proc or, when the recurrence has a closed-form solution, use the rsolve
and seq commands. We illustrate both below to obtain the values seen in
Tables 2.1 and 2.2. First, rsolve and seq.

> rsolve({H(n + 1) = 2*H(n) + 1, H(1) = 1}, H(n));
                              -1 + 2^n

We have a closed-form analytical solution \(H(n) = 2^n - 1\). If there had not
been a formula, rsolve would just have echoed the input. Using rsolve is a
quick way to see if a DDS has a closed-form solution.

Now we can use seq with the formula from rsolve to generate values. Also,
since there is a formula, we don’t have to iterate through all n up to the
number we want.

> seq([n, 2^n - 1], n = 1..64, 7);    # Note the step size of 7
[1, 1], [8, 255], [15, 32767], [22, 4194303], [29, 536870911], [36, 68719476735],
[43, 8796093022207], [50, 1125899906842623], [57, 144115188075855871],
[64, 18446744073709551615]





When a dynamical system does not have a closed-form analytical solution,
the rsolve command echoes its input, so we have no formula to use in the seq
command to generate values. We’ll need to use proc to create a program
to iterate the discrete dynamical system. Call the program Tower and use
inputs n0, t0, and n, where n0 is the first nonnegative integer in our domain,
t0 is our first range value (that is, (n0, t0) is the initial condition), and n
is the nonnegative integer value from our domain that we need. We use ‘option
remember’ so the program stores and can recall the previously iterated values.
Without remembering values, a simple recursion can take huge numbers of
calculations to compute a value. For example, the Fibonacci numbers given
by F(n) = F(n - 1) + F(n - 2) take on the order of \(2^n\) calculations to
compute F(n) when previous values aren’t remembered.


> Tower := proc(n0, t0, n)
      option remember;
      local T;
      if n > n0 then
          T := 2*Tower(n0, t0, n - 1) + 1;
      else
          T := t0;
      end if;
      return T;
  end proc:
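
The effect of option remember can be seen by counting how often a procedure body runs. The sketch below uses the Fibonacci recursion mentioned above with the starting values F(0) = 0 and F(1) = 1 (a common convention, assumed here); FibPlain, FibMemo, and the counter calls are hypothetical names used only for this illustration.

```maple
# calls counts how many times a procedure body is executed.
calls := 0:

# Plain recursion: every call recomputes its two predecessors.
FibPlain := proc(n::nonnegint)
    global calls;
    calls := calls + 1;
    if n < 2 then return n end if;
    return FibPlain(n - 1) + FibPlain(n - 2);
end proc:

# Same recursion, but previously computed values are stored and reused.
FibMemo := proc(n::nonnegint)
    global calls;
    option remember;
    calls := calls + 1;
    if n < 2 then return n end if;
    return FibMemo(n - 1) + FibMemo(n - 2);
end proc:

calls := 0: FibPlain(20): plain_calls := calls;
calls := 0: FibMemo(20):  memo_calls := calls;
```

For n = 20 the plain version runs its body 21891 times, while the remembering version runs it only 21 times, once for each of F(0) through F(20).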





Now we can use seq with the procedure Tower to generate values. (Tower
is in the book’s PSMv2 Maple package.) Unlike the formula, the procedure must
work through every n up to the number we want, but option remember ensures
that each value is computed only once.

> seq([n, Tower(1, 1, n)], n = 1..64, 7);    # Note the step size of 7
[1, 1], [8, 255], [15, 32767], [22, 4194303], [29, 536870911], [36, 68719476735],
[43, 8796093022207], [50, 1125899906842623], [57, 144115188075855871],
[64, 18446744073709551615]





Are the values the same as with the formula?
Let’s plot the recursion.

> pts := seq([n, Tower(1, 1, n)], n = 1..15):
> plot([pts], style = point, symbol = solidcircle, symbolsize = 14);

[Plot: the points (n, Tower(1, 1, n)) for n = 1..15]


It’s easy to see how quickly this recursion grows. 


Example 2.3. A Drug Dosage Problem. 

A doctor prescribes an oral dose of 100 mg of a certain drug every hour for a 
patient. Assume that the drug is immediately absorbed into the bloodstream 
once taken. Also, assume that every hour the patient’s body eliminates 25% of 
the drug that is in the bloodstream. Suppose that the patient’s bloodstream 
had 0 mg of the drug prior to taking the first pill. How much of the drug will 
be in the patient’s bloodstream after 72 hours? 


General Problem Statement. 
Determine the relationship between the amount of drug in the bloodstream 
and time. 


Assumptions.

• The problem can be modeled by a discrete dynamical system.
• The patient is of normal size and health.
• There are no other drugs being taken that will affect the prescribed
  drug.
• There are no internal or external factors that will affect the drug
  absorption or elimination rates.
• The patient always takes the prescribed dosage at the correct time.


Variables.
Define a(n) = amount of drug that is in the bloodstream after a period of
n = 0, 1, 2, ... hours.

Flow Diagram:
We diagram the flow in Figure 2.4.

[Flow diagram: (Amount of Drug in the System after n Hours) → 25% Eliminated]

FIGURE 2.4: Amount of Drug in the Bloodstream Flow Diagram


Model Construction. 


Let a(n) represent the amount of drug in the system after time period
n. We calculate the change in the drug’s amount as: change = dose -
system’s loss. Then the future = present + change paradigm gives

\[ \begin{aligned} a(n+1) &= a(n) - 0.25 \cdot a(n) + 100 \\ &= 0.75 \cdot a(n) + 100 \end{aligned} \]


From the DDS we see that since the body loses 25% of the amount of drug 
in the bloodstream every hour, there would be 75% of the amount of drug in 
the bloodstream remaining every hour. After one hour, the body has 75% of 
the initial amount, 0 mg, plus the dose of 100 mg that is added every hour. 
The bloodstream has 100 mg of drug after one hour. After two hours the body 
has 75% of the amount of drug that was in the bloodstream after one hour 
(100 mg) plus an additional 100 mg of drug added from the new oral dose. 
There would be 175 mg of drug in the bloodstream after two hours. After 
three hours the body has 75% of the amount of drug in the bloodstream after 
two hours plus an additional 100 mg of drug added to the bloodstream. Thus 
there would be 231.25 mg of drug in the bloodstream after three hours. 

The values are tabulated in Table 2.3. 


TABLE 2.3: Amount of Drug in the Bloodstream - First Computations

Hour   Amount of Drug (mg)
0      a(0) = 0
1      a(1) = 0.75 · a(0) + 100 = 100
2      a(2) = 0.75 · a(1) + 100 = 175
3      a(3) = 0.75 · a(2) + 100 = 231.25


We can see that the change that occurs every hour within this system (amount
of drug in the bloodstream), and the state of the system after any hour,
depend on the state of the system after the previous hour. This is a discrete
dynamical system (DDS).

To find the value of a(72) we can either iterate the recurrence with seq or
use rsolve to obtain a formula. Let’s start with iteration. (Remember to either
use restart or open a new Worksheet to have a fresh Maple environment.)

Since this is a simple recurrence, we’ll use a ‘short-cut’ procedure.

> a := n -> piecewise(n > 0, 0.75*a(n - 1) + 100, 0);

\[ a := n \mapsto \begin{cases} 0.75\,a(n-1) + 100 & 0 < n \\ 0 & \text{otherwise} \end{cases} \]

Asking Maple for a(72) will perform the iterations and display the result.

> a(72);
                              399.9999996


We could see all the iterates by using seq([k, a(k)], k = 0..72).
Let’s attempt to find a formula by using rsolve. (Remember to restart.)

> rsolve({a(n + 1) = 0.75*a(n) + 100, a(0) = 0}, a(n));

\[ 400 - 400 \left(\frac{3}{4}\right)^{n} \]

Note that, in this case, Maple returned an exact answer even though we
entered decimals.
What is the long-term level of drug in the patient’s bloodstream?

> limit(400 - 400*0.75^n, n = infinity);
                              400.

Checking iterates would show that a patient’s bloodstream would have
approximately 400 mg of the drug after 24 hours.
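
The claim about the iterates can be checked directly from the closed form returned by rsolve. In this sketch the names a_formula and hour, and the 1 mg tolerance used to decide "close to 400," are assumptions made only for illustration.

```maple
# Closed form from rsolve: a(n) = 400 - 400*(3/4)^n.
a_formula := n -> 400 - 400*(3/4)^n:

# Amount (mg) after selected hours, shown as decimals.
seq([k, evalf(a_formula(k))], k in [6, 12, 18, 24, 72]);

# First hour at which the amount is within 1 mg of 400 (assumed tolerance).
hour := 0:
while 400 - a_formula(hour) >= 1 do
    hour := hour + 1;
end do:
hour;
```

With this tolerance the level is within 1 mg of 400 from hour 21 onward, and at hour 24 it is about 399.6 mg.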

It’s time for a plot.

> pts := [seq([n, 400 - 400*0.75^n], n = 0..48)]:
> plot(pts, style = point, symbol = solidcircle, symbolsize = 14);

Interpretation of Results.


The DDS shows that the drug reaches a value where change stops. The
amount of the drug in the bloodstream eventually levels off at 400 mg. If
400 mg is both a safe and effective drug level, then this dosage schedule is an
acceptable treatment. Note that we’re looking at discrete points, but the drug
level is a continuous quantity. What is the shape of the continuous
curve modeling the drug level?

We discuss the concept of change stopping, called equilibrium, later in this
chapter. We used the Maple command limit to quickly determine the long-term
behavior of the DDS, discovering the equilibrium.


Example 2.4. The Time Value of Money. 

A bank customer wishes to purchase a $1000 savings certificate that pays 1.2% 
interest a year (APR) compounded monthly at 0.1% = 1.2%/12 per month. 
Why use a discrete model here? At our local financial institution, there is 
a sign that says that interest is compounded (and paid) at the end of each 
month based on the average monthly balance. Therefore, we conclude that a 
discrete model for interest is appropriate. 

Remark: Always divide the annual interest rate (APR) by the number of 
compounding periods per year to compute the actual interest rate being used. 
Here, the annual rate is 1.2% or 0.012 and interest is calculated and paid 
monthly (12 times per year). So, use 0.012/12 = 0.001 as the monthly interest 
rate. 


General Problem Statement.
Find a relationship between the amount of money invested and the time
over which it is invested.

Assumptions.
The interest rate is constant over the entire time period. No additional
money is added or withdrawn other than the interest.

Variables.
Let n = 0, 1, 2, ... be the number of months passed.
Let a(n) = the amount of money in the certificate after month n.


Flow Diagram:
We diagram the flow in Figure 2.5.

[Flow diagram: Monthly Interest (0.1%) → (Amount in the Account after n Months)]

FIGURE 2.5: Flow Diagram for Money in a Certificate


Model Construction. 


Since

a(n + 1) = the future: account balance at the end of month n + 1
a(n) = the present: account balance at the end of month n
0.001 · a(n) = the change: interest compounded at the end of month n

using our paradigm future = present + change, the worth of the certificate
with interest accumulated each month is

\[ \begin{aligned} a(n+1) &= a(n) + 0.001\,a(n) \\ &= 1.001\,a(n) \qquad \text{for } n = 0, 1, 2, \ldots \end{aligned} \]


The initial deposit of $1000 gives 
a(0) = 1000. 


Use the discrete time periods (1, 2, 3, ...) to iterate as follows:

\[ \begin{aligned} a(0) &= 1000 \\ a(1) &= 1.001 \cdot a(0) = 1001 \\ a(2) &= 1.001 \cdot a(1) = 1.001^{2} \cdot a(0) = 1002.001 \\ a(3) &= 1.001 \cdot a(2) = 1.001^{3} \cdot a(0) \approx 1003.003 \end{aligned} \]

Writing r = 1.001, the pattern is

\[ a(1) = r \cdot a(0),\quad a(2) = r^{2} \cdot a(0),\quad a(3) = r^{3} \cdot a(0),\quad a(4) = r^{4} \cdot a(0),\ \ldots, \]

suggesting that

\[ a(n) = r^{n} \cdot a(0) \qquad \text{for } n = 0, 1, 2, \ldots \]


Let’s check with Maple.

> rsolve({a(n + 1) = 1.001*a(n), a(0) = 1000}, a(n)):
> Money := unapply(%, n);

\[ \mathit{Money} := n \mapsto 1000 \left(\frac{1001}{1000}\right)^{n} \]

Note that Maple converts to rational numbers when using rsolve. Recall that
unapply turns an expression into a function.
What is the overall trend?

> Limit(Money(n), n = infinity) = limit(Money(n), n = infinity);

\[ \lim_{n \to \infty} 1000 \left(\frac{1001}{1000}\right)^{n} = \infty \]

> with(plots):
> pointplot([seq([i, Money(i)], i = 0..12*4)], title = "Money over Time");

[Plot: Money over Time]

The graph looks like a line even though this function is obviously nonlinear.
How many years does it take for the plot to show some curvature?

Let’s look at the sequence of 4 years of values to see how Money grows.

> Money_table := seq(evalf(Money(i)), i = 0..12*4);


Money_table := 1000., 1001., 1002.001000, 1003.003001, 
1004.006004, 1005.010010, 1006.015020, 1007.021035, 
1008.028056, 1009.036084, 1010.045120, 1011.055165, 
1012.066220, 1013.078287, 1014.091365, 1015.105456, 
1016.120562, 1017.136682, 1018.153819, 1019.171973, 
1020.191145, 1021.211336, 1022.232547, 1023.254780, 
1024.278035, 1025.302313, 1026.327615, 1027.353943, 
1028.381297, 1029.409678, 1030.439088, 1031.469527, 
1032.500996, 1033.533497, 1034.567031, 1035.601598, 
1036.637199, 1037.673836, 1038.711510, 1039.750222, 
1040.789972, 1041.830762, 1042.872593, 1043.915465, 
1044.959381, 1046.004340, 1047.050345, 1048.097395, 

1049.145492 


Can the account ever reach $10,000? We can determine that it will take
about 192 years to reach $10,000.

> months := fsolve(10000 = Money(n), n);
                         months := 2303.736194

> years := months/12;
                         years := 191.9780162

Since Money(n) goes to infinity as n grows, the account continues to grow as
long as there are no withdrawals.
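
The fsolve result can be cross-checked by hand with logarithms, using the closed form for the balance, 1000 · 1.001^n, found earlier in this example:

```latex
\begin{align*}
1000\,(1.001)^{n} &= 10000 \\
(1.001)^{n} &= 10 \\
n &= \frac{\ln 10}{\ln 1.001} \approx 2303.74 \text{ months} \approx 191.98 \text{ years.}
\end{align*}
```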


Example 2.5. A Simple Mortgage. 

Five years ago, your parents purchased a home by financing $150,000 over 20 
years with an interest rate of 4% APR. Their monthly payments are $908.97. 
They have made 60 payments and wish to know what they actually owe on 
the house at this time. They can use this information to decide whether or 
not to refinance their house at a lower interest rate of 3.25% APR for the next 
15 years or 3.5% APR for 20 years. 


Each period, the amount owed increases by the amount of the interest and
decreases by the amount of the payment.


General Problem.
Build and use a model that relates the time to the amount owed on a
mortgage for a home.

Assumptions.

• Initial interest was 4% APR.
• The monthly interest rate is 4%/12 ≈ 0.333%.
• Payments are made on time each month.
• The current rates for refinancing are 3.25% for 15 years and 3.50% for 20
  years.


Variables. 
Let b(n) = amount owed on the home after n months. 


Flow Diagram:
We diagram the flow in Figure 2.6.

FIGURE 2.6: Flow Diagram for Amount Owed on Mortgage


Model Construction. 


future = present + change

\[ \begin{aligned} b(n+1) &= b(n) + \left(\frac{0.04}{12} \cdot b(n) - 908.97\right) \\ &= \left(1 + \frac{0.04}{12}\right) b(n) - 908.97, \qquad b(0) = 150000 \end{aligned} \]


Model Solution:
Use rsolve to find a formula for b(n). Since Maple converts to rational
numbers for rsolve, we’ll embed an evalf to return to decimals.

> rsolve({b(n + 1) = (1 + 0.04/12)*b(n) - 908.97, b(0) = 150000}, b(n)):
> b := unapply(evalf(%), n);

\[ b := n \mapsto -122691.0273 \cdot 1.003333333^{n} + 272691.0273 \]


First, a reasonableness check.

> b(12*20);
                              0.1695

The last payment leaves about $0.17 (17¢) to pay; this is due to round-off
error in the computations as 0.04/12 is a repeating decimal value.

Let’s plot the DDS over the entire 20 years (240 months). Note the graph
shows that the balance essentially reaches 0 in 240 months.

> with(plots):
> pts := [seq([i, b(i)], i = 0..240)]:
> pointplot(pts, title = "Mortgage Balance");

[Plot: Mortgage Balance]

Now let’s make a table of the mortgage balance up to month 60.

> Mortgage_table := seq(b(i), i = 0..60);

Mortgage_table := 150000.0000, 149591.0299, 149180.6967, 148768.9957,
148355.9222, 147941.4719, 147525.6401, 147108.4222, 146689.8135,
146269.8095, 145848.4056, 145425.5968, 145001.3787, 144575.7466,
144148.6957, 143720.2214, 143290.3187, 142858.9831, 142426.2096,
141991.9936, 141556.3303, 141119.2146, 140680.6419, 140240.6074,
139799.1061, 139356.1329, 138911.6833, 138465.7522, 138018.3347,
137569.4258, 137119.0206, 136667.1138, 136213.7009, 135758.7764,
135302.3357, 134844.3734, 134384.8846, 133923.8641, 133461.3071,
132997.2080, 132531.5619, 132064.3638, 131595.6083, 131125.2903,
130653.4045, 130179.9458, 129704.9090, 129228.2886, 128750.0796,
128270.2764, 127788.8740, 127305.8668, 126821.2496, 126335.0171,
125847.1639, 125357.6843, 124866.5733, 124373.8251, 123879.4345,
123383.3959, 122885.7038


From the table, we see the balance on the mortgage is b(60) = 122885.7038.
After paying for 60 months, your parents still owe $122,885.70 of the
original $150,000. They have paid a total of $54,538.20, but only $27,114.30
went towards the principal; the rest, $27,423.90, was interest. If the family
continues with this loan, they will make 240 payments of $908.97 for a total of
$218,152.80. They would pay a total of $68,152.80 in interest. They’ve already
paid $27,423.90 in interest, so they would pay an additional $40,728.90 in
interest. Should they refinance?

The alternatives for refinancing were 3.25% for 15 years or 3.50% for 20
years. Assuming no additional costs for refinancing, only securing a new
mortgage for what they currently owe, $122,885.70, what is the best choice?

You will be asked to solve this problem in the exercise set.
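
The payment and interest tallies for the current mortgage can be reproduced with a few lines of arithmetic. This sketch covers only the existing loan, not the refinancing exercise; the variable names are introduced here for illustration.

```maple
payment := 908.97:
b0      := 150000:
b60     := 122885.70:                # balance after 60 payments, from the table

paid_so_far        := 60*payment;                     # total paid to date
principal_paid     := b0 - b60;                       # reduction in the balance
interest_paid      := paid_so_far - principal_paid;   # interest paid to date
total_interest     := 240*payment - b0;               # interest over the full 20 years
interest_remaining := total_interest - interest_paid; # interest still to be paid
```

The same bookkeeping, applied to a new loan amount, rate, and term, is what the refinancing comparison in the exercises requires.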








Exercises 


Iterate and graph the following DDSs. Explain their long-term behavior. For 
each DDS, find a realistic scenario that it might explain/model. 











1. a(n + 1) = 0.5a(n) + 0.1, a(0) = 0.1
2. a(n + 1) = 0.5a(n) + 0.1, a(0) = 0.2
3. a(n + 1) = 0.5a(n) + 0.1, a(0) = 0.3
4. a(n + 1) = 1.01a(n) - 1000, a(0) = 90000
5. a(n + 1) = 1.01a(n) - 1000, a(0) = 100000
6. a(n + 1) = 1.01a(n) - 1000, a(0) = 110000
7. a(n + 1) = -1.3a(n) + 20, a(0) = 9
8. a(n + 1) = -4a(n) + 50, a(0) = 9
9. a(n + 1) = 0.1a(n) + 1000, a(0) = 9
10. a(n + 1) = 0.9987a(n) + 0.335, a(0) = 72
11. Consider the mortgage problem of Example 2.5. Determine what
    the cost of each refinancing alternative is compared to the current
    mortgage. Make a recommendation to your parents.





2.4 Equilibrium Values and Long-Term Behavior

Equilibrium Values

Recall our original paradigm 


Future = Present + Change. 


When the DDS stops changing, change equals zero, so future equals
present, and the system remains there. A value for which change equals 0, if
any exist, is an equilibrium value. Change having stopped gives us a context
for the concept of the equilibrium value.

We will define the equilibrium value, also called a fixed point, as the value
where change stops; i.e., where change equals zero. The value A is an
equilibrium value for the DDS

a(n + 1) = f(a(n), a(n - 1), ..., a(0)),

if, whenever a(N) = A for some N, all future values satisfy a(n) = A for
n > N.
Formally, we define the equilibrium value as follows: 


Definition 2.1. Equilibrium Value. 

The number ev is called an equilibrium value or a fixed point for a discrete 
dynamical system a(n) if a(k) = ev for all values of k when the initial value 
a(0) is set to ev. That is, a(k) = ev is a constant solution to the recurrence 
equation for the dynamical system. 


Another way of characterizing such values is to note that a number ev
is an equilibrium value for a dynamical system a(n + 1) = f(a(n), a(n - 1),
..., n) if and only if ev satisfies the equation ev = f(ev, ev, ..., ev). This
characterization is the genesis of the term fixed point.

Definition 2.1 shows that a linear homogeneous dynamical system of order
1, a(n + 1) = r a(n) with r ≠ 1, has only the value 0 as an equilibrium value.

In general, dynamical systems may have no equilibrium values, a single
equilibrium value, or multiple equilibrium values. A first-order linear system
a(n + 1) = r a(n) + b with r ≠ 1 has the unique equilibrium value b/(1 - r).
Nonlinear dynamical systems may have several equilibrium values.

Not all DDSs have equilibrium values, and many DDSs have equilibrium values
that the system will never achieve unless a(0) equals that value.

The DDS from the Tower of Hanoi, a(n + 1) = 2a(n) + 1, has an equilibrium
value ev = -1. If we begin with a(0) = -1 and iterate, we get

a(0) = -1,
a(1) = 2a(0) + 1 = -1,
a(2) = 2a(1) + 1 = -1,
a(3) = 2a(2) + 1 = -1,

etc. (Note that the physical Tower of Hanoi puzzle cannot take a negative
number of moves, so for the puzzle, -1 is an unreachable equilibrium value.)
We can use the observation that ev = f(ev, ev, ..., ev) to find equilibrium
values, and to determine whether or not a given DDS has an equilibrium value.
Look again at the DDS a(n + 1) = 2a(n) + 1. Substituting a(n + 1) = ev and
a(n) = ev into our DDS yields





ev = 2ev +1. 
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Solving for ev gives the equilibrium value for the DDS as ev = —1. 
Now, let’s consider the DDS a(n + 1) = a(n) + 1. Using our observation, 
we write 
ev=ev+t+l. 


This equation has no solution, so the DDS a(n +1) = 2a(n)+1 does not have 
any equilibrium values. 
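
The substitution test can also be handed to Maple’s solve command. The lines below are a minimal sketch, assuming a fresh worksheet in which the name ev is unassigned; the names hanoi and shift are ours and are used only for illustration.

```maple
# Equilibrium test ev = f(ev) for two first-order DDSs.
# Assumes the name ev has no assigned value (e.g., a fresh worksheet).
hanoi := solve(ev = 2*ev + 1, ev);   # Tower of Hanoi DDS: a(n+1) = 2 a(n) + 1
shift := [solve(ev = ev + 1, ev)];   # a(n+1) = a(n) + 1; wrap the result in a list
nops(shift);                         # number of equilibrium values found
```

We expect solve to return -1 for the Tower of Hanoi DDS and to return nothing for the second DDS, so that the list shift is empty and nops reports zero equilibrium values.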


Example 2.6. Finding Equilibrium Values. 
Consider the following four DDSs. Find their equilibrium value(s), if any exist. 

1. a(n + 1) = 0.3a(n) - 10 
2. a(n + 1) = 1.3a(n) + 20 
3. a(n + 1) = 0.5a(n) 
4. a(n + 1) = -0.1a(n) + 11 

Solutions. 

1. ev = 0.3ev - 10 => ev = -10/0.7 ≈ -14.286 
2. ev = 1.3ev + 20 => ev = -20/0.3 ≈ -66.667 
3. ev = 0.5ev => ev = 0 
4. ev = -0.1ev + 11 => ev = 11/1.1 = 10 


Above, we observed that DDSs that have equilibrium values may never 
attain their equilibrium value, given some specific initial condition a(0). In 
DDS 3 from Example 2.6 above, choose any initial value A not equal to 
0, the ev. Then, by iteration, a(k) = 2^(-k) A ≠ 0 for any k. This DDS can 
never achieve its equilibrium when starting from a nonzero value even though 
lim(n→∞) a(n) = 0, the equilibrium value. 
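
A short computation makes the distinction between reaching and approaching the equilibrium concrete. This is only a sketch: the starting value 8 and the name A0 are our own choices, and the exact rational 1/2 is used in place of 0.5 so that no round-off hides the nonzero terms.

```maple
# DDS 3 of Example 2.6, a(n+1) = 0.5 a(n), written with the exact rational 1/2.
A0 := 8:                                # hypothetical nonzero starting value
seq(A0*(1/2)^k, k = 0 .. 10);           # every term is a nonzero rational number
limit(A0*(1/2)^k, k = infinity);        # the terms nevertheless tend to 0
```

Each listed term is half of the previous one and none is zero, while the limit command should report 0.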

We found the equilibrium value was -1 for the DDS from the Tower of 
Hanoi puzzle. Suppose that we begin iterating the Tower’s DDS with an initial 
value a(0) = 0. Then 

a(1) = 2a(0) + 1 = 1, 
a(2) = 2a(1) + 1 = 3, 
a(3) = 2a(2) + 1 = 7, 
a(4) = 2a(3) + 1 = 15, 

etc. The values continue to get larger and will never reach the value of -1. 
Since a(n) represents the number of moves of the disks, the value of -1 makes 
no real sense in the context of the puzzle. 

We will study equilibrium values in many of the applications of discrete 
dynamical systems. In general, a linear nonhomogeneous discrete dynamical 
system, where the nonhomogeneous part is a constant, will have an equilibrium 
value. Can you find any exceptions? A linear homogeneous discrete dynamical 
system will always have an equilibrium value of zero. Why? 








2.5 A Graphical Approach to Equilibrium Values 


We can examine a graph of the DDS’s iterations using Maple. If the values 
reach a specific value and remain constant (the graph levels off), then that 
value is an equilibrium value (change has stopped). 

Dynamical systems often represent real-world behavior that we are trying 
to understand. We model reality to be able to predict future behavior and gain 
deeper insights into the behavior and how to influence or alter that behavior. 
Thus, we have great interest in the predictions of the model. Understanding 
how the model changes in the future is our goal. 


Models of the Form: a(n + 1) = r · a(n) + b with Constant r and b 


Consider a savings account with 12% APR interest compounded monthly 
where we deposit $1000 per month. We repeat our earlier analysis, adding 
the monthly deposit. (Remember to start with a fresh Maple worksheet, then 
load the plots package.) 


> DDS := a(n + 1) = (1 + 0.12/12) * a(n) + 1000 : 
> rsolve({DDS, a(0) = 0}, a(n)) : 
  a := unapply(%, n); 

        a := n -> 100000 (101/100)^n - 100000 

> pts := [seq([k, a(k)], k = 0..24)] : 
  pointplot(pts, title = "Savings Account with Monthly Deposit"); 

The DDS a(n + 1) = 1.01 a(n) + 1000, with r = 1.01 > 1, grows without 
bound. The graph suggests that there is no equilibrium value, that is, no value 
where the graph levels off. Analytically, we have ev = 1.01ev + 1000 => 
ev = -100000. There is an equilibrium value ev = -100000, but 
that value will never be reached in our savings account problem. Without 
withdrawals, the savings account can never be $100,000 in the hole! 

What happens if r is less than 0? 

Proceed as before, but with a negative r. The new DDS is 


a(n + 1) = -1.01 a(n) + 1000 

with a(0) = 0. Analytically solve for the equilibrium value. 

ev = -1.01 ev + 1000 
ev = 497.5124378 


The definition of equilibrium value implies that if we start at a(0) = 
497.5124378, we stay there forever. From the plot of the DDS in Figure 2.7, 
we note the oscillations between positive and negative numbers, each growing 
without bound as the oscillations fan out. Although there is an equilibrium 
value, the solution to our example does not tend toward this fixed point. 
FIGURE 2.7: A Linear DDS with Negative r 
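
A plot of this kind can be produced along the following lines. This is a sketch rather than the code behind the figure: the procedure name negr, the range of 100 iterations, and the plot title are our assumptions.

```maple
with(plots):
# Hypothetical procedure: iterates a(n+1) = -1.01 a(n) + 1000 from a(0) = 0.
negr := proc(n :: nonnegint)
    option remember;                     # store values already computed
    if n = 0 then 0 else -1.01*negr(n - 1) + 1000 end if;
end proc:
pts := [seq([k, negr(k)], k = 0 .. 100)]:    # the plotting range is our choice
pointplot(pts, title = "A Linear DDS with Negative r");
```

Because |r| = 1.01 is only slightly larger than 1, the oscillations widen slowly; a longer range makes the fanning out easier to see.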


Let’s examine an example with a value of r between 0 and 1, that is, 
0 < r < 1. 

A drug concentration model with a constant dosage of 16 mg each time 
period (4 hours) has the DDS 


a(n + 1) = 0.5a(n) + 16 


with an initial dosage of a(0) applied prior to beginning the regimen. Figure 2.8 
shows the DDS with 4 different initial values, a(0) = 10, 20, 32, and 50 mg. 

FIGURE 2.8: Drug Concentration DDS with Four Different Initial Values 
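
One way to draw the four sequences on a single set of axes is sketched below. The helper drugpts, the choice of 15 time periods, and the title are our assumptions, not the code used for the figure.

```maple
with(plots):
# Hypothetical helper: list of points [n, a(n)] for a(n+1) = 0.5 a(n) + 16.
drugpts := proc(a0, N)
    local a, n, pts;
    a := a0;
    pts := [[0, a]];
    for n from 1 to N do
        a := 0.5*a + 16;                 # one step of the DDS
        pts := [op(pts), [n, a]];        # append the new point
    end do;
    pts;
end proc:
display([seq(pointplot(drugpts(a0, 15)), a0 in [10, 20, 32, 50])],
        title = "Drug Concentration from Four Initial Values");
```

The sequence started at a(0) = 32 should plot as a horizontal row of points, since 32 is the value at which change stops.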


Regardless of the starting value, each graph shows the future terms of a(n) 
approach 32. Thus, 32 is the equilibrium value. We could have easily solved 
for the equilibrium value algebraically as well. 


a(n + 1) = 0.5a(n) + 16 => ev = 0.5ev + 16 => ev = 32 


In general, finding the equilibrium value for this type of DDS requires 
solving the equation ev = r · ev + b for ev. We find 

ev = b/(1 - r) when r ≠ 1. 

Applying this formula to the drug concentration DDS above calculates the 
equilibrium value as 

ev = 16/(1 - 0.5) = 32, 

the value we observed from the graphs. 


Stability and Long-Term Behavior 


For a dynamical system a(n) with a specific initial condition a(0) = A0, we 
have shown that we can compute a(1), a(2), and so forth, by iteration. Often 
the particular values are not as important as the long-term behavior. By 
long-term behavior, we refer to what will eventually happen to a(n) for larger and 
larger values of n. There are many types of long-term behavior that can occur 
with DDSs; we will only discuss a few here. 

If the values of a(n) for a DDS eventually get close⁵ to the equilibrium 
value ev, for all initial conditions in a range, then the equilibrium is called a 
stable equilibrium or an attracting fixed point. 


Example 2.7. A Stable Equilibrium or Attracting Fixed Point. 
Consider the DDS a(n + 1) = 0.5a(n) + 64 with initial conditions either 
a(0) = 0, 50, 100, 150, or 200. In each case, the values approach ev = 128; 
therefore, ev = 128 is a stable equilibrium. Examine Figure 2.9. 

FIGURE 2.9: Drug Concentration with Varying Initial Values 





5 “Eventually get close to ev” can be defined formally as: For any positive tolerance ε, 
there is an integer N(ε), such that whenever n > N, then |a(n) - ev| < ε, i.e., a(n) is within 
ε of ev whenever n > N. See, e.g., Bauldry Intro. to Real Analysis [Bauldry2009]. 




Let’s use rsolve to find a general formula, and then generate a table of 
values for the DDS. 


> rsolve({a(n + 1) = 0.5 * a(n) + 64, a(0) = A}, a(n)); 
  a := unapply(%, [A, n]) : 

        A (1/2)^n - 128 (1/2)^n + 128 

> inits := [0, 50, 100, 150, 200] : 

> gen := (i, j) -> evalf[4](a(inits[j], i)); 
  DrugConcTable := Matrix(10, 5, gen); 

DrugConcTable := 
         64.0    89.0   114.0   139.0   164.0 
         96.0   108.5   121.0   133.5   146.0 
        112.0   118.2   124.5   130.8   137.0 
        120.0   123.1   126.2   129.4   132.5 
        124.0   125.6   127.1   128.7   130.2 
        126.0   126.8   127.6   128.3   129.1 
        127.0   127.4   127.8   128.2   128.6 
        127.5   127.7   127.9   128.1   128.3 
        127.8   127.8   127.9   128.0   128.1 
        127.9   127.9   128.0   128.0   128.1 

Row i of the table holds a(i) for i = 1, ..., 10, and column j corresponds to 
the jth initial value in inits. Notice that all the sequences are converging 
to 128, the attracting fixed point. (To see more than 10 rows, first execute 
interface(rtablesize = 100).) 








Example 2.8. A Financial Model with an Unstable Equilibrium. 
Consider a financial model where $100 is deposited each month in an account 
that pays 12% APR compounded monthly. The DDS is 


B(n + 1) = (1 + 0.12/12) B(n) + 100, B(0) = 100. 


Let’s first determine a formula for B(n). 


> rsolve({B(n + 1) = (1 + 0.12/12) * B(n) + 100, B(0) = 100}, B(n)) : 
  B := unapply(%, n); 

        B := n -> 10100 (101/100)^n - 10000 


The algebraic technique easily gives us the equilibrium value. 


> ev := solve(ev = 1.01 * ev + 100, ev); 

        ev := -10000. 

The ev value is -10000. If the DDS ever achieves an input of -10000, then 
the system stays at -10000 forever. Since $100 is deposited each month and 
there are no withdrawals, the account must stay positive, so we can’t reach the 
equilibrium value! We say this equilibrium value is unstable or a repelling 
fixed point. 

Figure 2.10 shows a graph of our financial model and ev. 

FIGURE 2.10: Unstable Equilibrium Value 


Generate a table of values. 


> <<seq(i, i = 0..9)> | <seq(evalf[6](B(i)), i = 0..9)>>; 

        0     100.0 
        1     201.0 
        2     303.010 
        3     406.040 
        4     510.101 
        5     615.202 
        6     721.354 
        7     828.567 
        8     936.853 
        9     1046.22 


The values increase over time and never move toward -10000. Therefore, the 
ev is unstable or repelling. 

Try this model with different initial values and different deposit values. 
Can the equilibrium become attracting? 

Often, we characterize the long-term behavior of the system in terms of its 
stability. If a DDS has an equilibrium value, and if the solution tends to the 
equilibrium value from starting values near the equilibrium value, then the 
DDS is said to be stable. We summarize the results for the dynamical system 
a(n + 1) = r · a(n) + b, where b ≠ 0, in Table 2.4. 

TABLE 2.4: Linear Nonhomogeneous Discrete Dynamical Systems 

Value of r     Equilibrium    Stability    Long-Term Behavior 
r < -1         b/(1 - r)      Unstable     Unbounded oscillations 
r = -1         b/2            Unstable     Oscillates between a(0) and b - a(0) 
-1 < r < 0     b/(1 - r)      Stable       Oscillates about the equilibrium while approaching it 
r = 0          b              Stable       Constant 
0 < r < 1      b/(1 - r)      Stable       Goes to equilibrium from a(0) 
r = 1          None           Unstable     Unbounded (goes to sgn(b) · ∞) 
1 < r          b/(1 - r)      Unstable     Unbounded 


From the table we see that the DDS a(n + 1) = r · a(n) + b has an equilibrium 
b/(1 - r) whenever r ≠ 1 and has no equilibrium when r = 1. Further, we see 
that the equilibrium is stable only when |r| < 1. 
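
The summary can be condensed into a few lines of Maple. The procedure below is a hypothetical helper of our own (the name Classify and the returned labels are not from the text); it follows the statement above that the equilibrium is b/(1 - r) for r ≠ 1 and is stable only when |r| < 1. Exact rationals are passed in so that the comparison with 1 is not affected by floating-point round-off.

```maple
# Hypothetical helper for a(n+1) = r*a(n) + b with b <> 0.
# Returns the equilibrium value (or the name none) followed by a stability label.
Classify := proc(r, b)
    if r = 1 then
        return none, "unstable";        # no equilibrium when r = 1
    elif abs(r) < 1 then
        return b/(1 - r), "stable";     # attracting fixed point
    else
        return b/(1 - r), "unstable";   # repelling fixed point
    end if;
end proc:

Classify(1/2, 16);          # drug concentration model
Classify(101/100, 1000);    # savings account model
Classify(1, 1);             # a(n+1) = a(n) + 1
```

For the drug model the call should return 32 with the label "stable", and for the savings model -100000 with the label "unstable", in agreement with the earlier examples.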


Relationship of the Equilibrium to Analytical Solutions 


If a discrete dynamical system has an equilibrium ev, we can use the value to 
find the analytical solution. 

Consider the DDS for a 6.5% APR mortgage having monthly payments of 
$639.34: 


B(n + 1) = 1.00541667 B(n) - 639.34, B(0) = 73,395 


The ev value is analytically found to be 118031.9927. Substituting the ev 
and r values into B(k) = r^k C + ev and then setting k = 0 gives 


B(0) = 1.00541667^0 C + 118031.9927 = 73395 


which yields 

C = -44636.99270 


The solution to our DDS is 

B(n) = -44636.99270 · 1.005416667^n + 118031.9927 

(Check this with rsolve!) 
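
One way to carry out the suggested check is sketched here. We use the hypothetical name Bm so that the sketch does not clash with any value already assigned to B in the worksheet, and we assume n is unassigned.

```maple
# Check of the mortgage solution with rsolve; Bm is a hypothetical name.
sol := rsolve({Bm(n + 1) = 1.00541667*Bm(n) - 639.34, Bm(0) = 73395}, Bm(n));
evalf(eval(sol, n = 0));      # should reproduce the initial balance 73395
evalf(eval(sol, n = 180));    # balance remaining after 180 payments
```

Comparing sol with the formula above, the coefficient of the power term and the constant term should agree up to round-off.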
We can also use this method to determine the mortgage payment. The 
DDS is now 


B(n + 1) = 1.00541667 B(n) - P, B(0) = 73395 

with solution 

B(n) = 1.00541667^n c + d, 

where d is the equilibrium value ev. Build a system of two equations and two 
unknowns. 


73395 = c + d                    (from B(0) = 73395) 
0 = 1.00541667^180 c + d         (from B(180) = 0) 


Solve the system. 

> solve({73395 = c + d, 0 = 1.00541667^180 * c + d}, {c, d}); 

        {c = -44638.66498, d = 118033.6650} 


Thus 

B(n) = -44638.66498 · 1.005416667^n + 118033.6650. 


(Note: there is a little round-off error creeping into the calculations.) 
To find the payment P, we know that 

ev = 1.005416667 ev - P. 

Substitute ev = 118033.6650 and solve to find 

P = 639.3494. 


This value agrees with our original payment other than some round-off error. 
Round-off in financial calculations can be very serious⁶, so always be careful! 


Equilibrium Values and the Limit 


Studying the behavior of equilibrium values relies fundamentally on the con- 
cept of limit. Elementary calculus is based on limit and includes a careful study 
of the concept. Advanced calculus or real analysis presents a precise, carefully 
crafted definition, but using that would overly complicate our investigation of 
equilibrium values. The informal definition of limit of a sequence 


lim(k→∞) a(k) = L if and only if for k large enough a(k) must be close to L⁷ 


will serve our needs and keep us focused on the behavior of the discrete dynam- 
ical system in question. 





6 See Whiteside’s Computer capers: Tales of electronic thievery, embezzlement, and 
fraud, Crowell, New York, c1978. 

7 A more formal definition that specifies large enough and close to precisely is: lim(k→∞) a(k) = 
L iff given any tolerance ε > 0, there is a positive integer N* such that whenever n > N*, 
then |a(n) - L| < ε. See, e.g., Bauldry Intro. to Real Analysis [Bauldry2009]. 

We are interested in what happens to a(k) as k gets larger and larger 
without bound. (In other words, when we iterate many times.) It may happen 
that for large values of k, a(k) is close or even equal to some number L. If 
increasing k (doing more iterations) causes a(k) to get even closer to L until 
for very large values of k, a(k) is essentially equal to L, then L is the limit 
of a(k). If a(k) has a limit, then we also say a(k) converges to L or simply 
a(k) converges. It is important to understand that if a(k) converges to L, then 
further increases in k will never cause a(k) to move “too far” away from L. 
Also remember that we are looking at large values of k. How large is large 
depends upon the DDS. Compare how large k must be for the two sequences 
DDS1: a(k) = e^(-k) and DDS2: b(k) = 1/(ln(k) + 1). 

If increasing k causes a(k) to continue to increase or to oscillate, and a(k) 
does not begin to converge to a number L, we say a(k) is diverging and the 
limit does not exist. 
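
To see how differently "large" can behave, we can ask Maple how far each of the two sequences e^(-k) and 1/(ln(k) + 1) must run before its terms fall below a chosen tolerance; both sequences have limit 0. The tolerance 0.01 and the names tol, k1, and k2 are our assumptions for this sketch.

```maple
tol := 0.01:                        # hypothetical tolerance for "close to 0"
# First sequence: e^(-k) < tol exactly when k > ln(1/tol).
k1 := ceil(evalf(ln(1/tol)));
# Second sequence: 1/(ln(k) + 1) < tol exactly when k > e^(1/tol - 1).
k2 := evalf(exp(1/tol - 1));
```

The first sequence needs only k = 5, while the second needs k on the order of 10^42, so "large enough" depends heavily on the sequence.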


Example 2.9. Convergent Difference Equations. 
Consider 


1. the difference equation a(n + 1) = 0.9a(n) + 2 with a(0) = 1. Iterating 
with an accuracy of 6 decimal places produces the following: 


a(1) = 2.9 
... 
a(150) = 19.999997 
a(151) = 19.999998 
... 
a(156) = 19.999999 
... 
a(166) = 20.000000 


and for all k > 166, a(k) = 20 to 6 decimal places. Thus, for this difference 
equation, the limit of a(k) is 20. 


2. the difference equation b(n + 1) = -0.1 b(n) + 11 with b(0) = 1. Iterating 
with an accuracy of 6 decimal places, we see the following: 


b(4) = 9.999100 
b(5) = 10.000090 
b(6) = 9.999991 
b(7) = 10.000001 
b(8) = 9.999999 
b(9) = 10.000000 

and for all k > 9, b(k) = 10 to 6 decimal places. Thus, for this difference 
equation, the limit of b(k) is 10. 

Even though b(4) is less than 10 and b(5) is greater than 10, b(5) is closer 
to 10 than b(4) is. Increasing k caused b(k) to get closer to 10. Thus, the 
limit as k → ∞ of b(k) is 10. Graph it! 


If a(k) does not converge to L, then the sequence diverges. There are several 
ways in which a sequence may diverge. If a(k) gets infinitely large as k gets 
infinitely large, then a(k) diverges. If a(k) gets infinitely large in the negative 
direction as k gets infinitely large, then a(k) also diverges. If a(k) oscillates 
between large positive and large negative values, always getting further from 
0 as k gets infinitely large, then a(k) diverges. If a(k) oscillates in a pattern 
between two or more fixed values as k gets infinitely large, a(k) diverges. If 
a(k) shows absolutely no pattern of behavior as k gets infinitely large, then 
a(k) diverges. Given a random sequence a(k), it will likely diverge. 


Example 2.10. Divergent Difference Equations. 
Consider 


1. the difference equation a(n + 1) = 4a(n) + 2 with a(0) = 1. Iteration of 
this difference equation produces 


a(10) = 1747626 
a(11) = 6990506 
a(12) = 27962026 
a(13) = 111848106 


Further increase in k causes increase in a(k). Thus, a(k) diverges. 


2. the difference equation a(n + 1) = -4a(n) + 2 with a(0) = 1. In this case, 
iteration produces 


a(10) = 629146 
a(11) = -2516582 
a(12) = 10066330 
a(13) = -40265318 


As k gets larger and larger, a(k) gets further from 0, always oscillating 
between positive and negative values. Thus, a(k) diverges. 
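
Values such as these can be checked with a few lines of Maple. The procedure below is a hypothetical helper (the name IterateDDS is ours); because the coefficients and the initial value are integers, Maple carries out the iteration exactly.

```maple
# Hypothetical helper: returns a(N) for a(n+1) = r*a(n) + b with a(0) = a0.
IterateDDS := proc(r, b, a0, N)
    local a, n;
    a := a0;
    for n from 1 to N do
        a := r*a + b;               # one step of the DDS
    end do;
    a;
end proc:

seq([n, IterateDDS(4, 2, 1, n)], n = 10 .. 13);     # part 1 of the example
seq([n, IterateDDS(-4, 2, 1, n)], n = 10 .. 13);    # part 2 of the example
```

Each pair in the output lists n followed by a(n), which can be compared with the values in both parts of the example.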


We close this section with a very useful theorem. 


Theorem. Limits are Stable Equilibria. 
If a dynamical system has a limit, then that limit is a stable equilibrium value; 
i.e., is an attracting fixed point. 
Exercises 


1. For each of the following DDSs, find the equilibrium value(s), if any exist. 
Classify the DDS as stable or unstable. 








(a) a(n + 1) = 1.23a(n) 
(b) a(n + 1) = 0.99a(n) 
(c) a(n + 1) = -0.8a(n) 
(d) a(n + 1) = a(n) + 1 
(e) a(n + 1) = 0.75a(n) + 21 
(f) a(n + 1) = 0.80a(n) + 100 
(g) a(n + 1) = 0.80a(n) - 100 
(h) a(n + 1) = -0.80a(n) + 100 


2. Build a numerical table for the following DDSs. Observe the patterns and 
provide information on equilibrium values and stability. 


(a) a(n + 1) = 1.1a(n) + 50 with a(0) = 1010 
(b) a(n + 1) = 0.85a(n) + 100 with a(0) = 10 
(c) a(n + 1) = 0.75a(n) - 100 with a(0) = -25 
(d) a(n + 1) = a(n) + 100 with a(0) = 500 


2.6 Modeling Nonlinear Discrete Dynamical Systems 


In this section, we build nonlinear DDSs to describe the change in behavior of 
the quantities we study. We also will study systems of DDSs to describe the 
changes in various quantities that act together in some way or influence each 
other in deterministic ways. We define a nonlinear DDS as follows: 


If the DDS a(n + 1) = f(a(n), ..., n) involves powers of a(k) [such as 
a^2(k)], or a functional relationship [such as a(k)/a(k - 1) or sin(a(k))], 
we say the discrete dynamical system is nonlinear. 


We will restrict our investigations to numerical and graphical solutions of 
nonlinear models. Analytical solutions of nonlinear models are studied in more 
advanced mathematics courses. 

We often model population growth by assuming that the change in pop- 
ulation is directly proportional to the current size of the given population. 
This produces a simple, first-order DDS similar to those seen earlier. It might 
appear reasonable at first examination, but the long-term behavior of growth 
without bound is disturbing. Why would growth without bound of a yeast 
culture in a jar (or confined space) be alarming?⁸ 

There are certain factors that affect population growth. These factors 
include resources: food, oxygen, space, etc. The available resources can sup- 
port some maximum population. As this number is approached, the change, 
or growth rate, should decrease and the population should never exceed its 
resource supported amount. If the population does exceed the maximum sup- 
ported amount, the growth should become negative. 


Example 2.11. Growth of a Cancer Culture. 
Problem. 

To determine whether our treatments are successful, we need to predict 
the growth of cancer cells in a controlled environment as a function of the 
resources available and the current cell ‘population.’ 


Assumptions and Variables. 

We assume that the population size is best described by the weight of the 
biomass of the culture. Define y(n) as the population size of the cell culture 
after period n. There exists a maximum carrying capacity M that is sustain- 
able by the resources available. The culture is growing under the conditions 
established. 


Model. 


n: the time period measured in hours 
y(n): the population size after period n 
M: the carrying capacity of our system 

y(n + 1) = y(n) + k y(n) (M - y(n)) 


where k is the constant of proportionality (a function of growth rate). 

Using the data collected in our experiment, we first plot y(n) versus n and 
find a stable ev of approximately 670. Next, we plot y(n + 1) - y(n) versus 
y(n) (670 - y(n)) to find the slope, which gives k as approximately 0.00090. 
With k = 0.00090 and a carrying capacity of 670 in biomass, our specific model⁹ 
is then 

y(n + 1) = y(n) + 0.0009 y(n) (670 - y(n)) 


Again, this is nonlinear because of the y^2(n) term (after expanding the 
right side of the model). There is no closed-form analytical solution for this 
model. The numerical solution iterated in Maple from an initial condition of 
biomass = 30.0 follows. 





8 See Rev. Malthus’ 1798 book An Essay on the Principle of Population [Malthus1798]. 
9 For an approach to fitting this type of model to data, see Bauldry, “Fitting Logistics 
to the U.S. Population,” MapleTech, 4(3), 73-77. 

> k, M, init := 0.0009, 670, 30.0 : 


> biomass := proc(n :: integer) 
    option remember; 
    piecewise(n > 0, 
        biomass(n - 1) + k * biomass(n - 1) * (M - biomass(n - 1)), 
        init); 
  end proc: 


We use option remember in the procedure to ‘remember’ (store) values of 
biomass as they are computed in order to dramatically reduce the number of 
calculations needed. Try it without the option! 


> pts := [seq([n, biomass(n)], n = 0..30)] : 


> pointplot(pts, view = [0..30, 0..700], title = "Biomass"); 

The model shows stability in that the population (biomass) of the cell culture 
quickly approaches 670 as n gets large. Thus, the population is eventually 
stable at approximately 670 units. We would then proceed with treatments, 
collect new data, and evaluate the new trend of the biomass to determine 
whether the treatment is successful or not. 
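
The equilibrium values of this model can also be found with solve by setting y(n + 1) = y(n). The sketch below uses the hypothetical names yeq, kk, and MM so that it does not interfere with the values already assigned to k and M in the worksheet.

```maple
# Equilibria of y(n+1) = y(n) + k y(n) (M - y(n)): set y(n+1) = y(n) = yeq.
# yeq, kk, and MM are hypothetical, unassigned names.
solve(yeq = yeq + kk*yeq*(MM - yeq), yeq);

# Starting exactly at the carrying capacity, one step of the model changes nothing.
eval(yeq + kk*yeq*(MM - yeq), [yeq = 670, kk = 0.0009, MM = 670]);
```

The solve command should return the two values 0 and MM, so the model has equilibria at an empty culture and at the carrying capacity; the iteration above approaches the second of these.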


Example 2.12. Spread of a Contagious Disease. 

There are 1000 students in a college dormitory. Several students have returned 
from a Spring Break trip where they were exposed to mumps, a highly con- 
tagious disease. The Health Center wants to build a model to determine how 
fast the disease will spread in order to determine effective vaccination and 
quarantine procedures. 
Problem. 
Predict the number of students infected with mumps as a function of time. 


Assumptions and Variables. 

We assume all students are susceptible to the disease. The possible contacts 
of infected and susceptible students are proportional to their product 
(as an interaction term). Let m(n) be the number of students infected with 
mumps after n days. 


Model. 

m(n + 1) = m(n) + k · m(n) · (1000 - m(n)) 

Two students have come down with mumps. The rate of spreading per 
day is characterized by k = 0.0005. A vaccine can be delivered and students 
vaccinated within 1-2 weeks. Hence 


m(n + 1) = m(n) + 0.0005 m(n) · (1000 - m(n)) 


This model matches the cancer model above. We modify its Maple code 
accordingly. 


> k, M, init := 0.0005, 1000, 2 : 


> Mumps := proc(n :: integer) 
    option remember; 
    piecewise(n > 0, 
        Mumps(n - 1) + k * Mumps(n - 1) * (M - Mumps(n - 1)), init); 
  end proc: 
> pts := [seq([n, round(Mumps(n))], n = 0..30)] : 


We rounded the value of Mumps as it represents an integer: the number of 
infected students. 
Graph the points to see the trend of the disease. 

> pointplot(pts, title = "Mumps Infections"); 

By 21 days, essentially all the students are infected. Let’s look at the weekly 
counts. 


> seq(round(Mumps(j)), j in [7, 14, 21, 28]); 
        33, 413, 977, 1000 





Interpretation. 

The results show that essentially all the students will be infected by 3 
weeks. Since only about 3% will be infected within one week, every effort 
must be made to get the vaccine to the school and to vaccinate the students 
within one week. 

We can model the effect of the vaccine on the spread of mumps by reducing 
k in the model and using Mumps(7) as the initial value of the new curve. 
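
A sketch of that idea follows. The reduced rate 0.0001 is purely hypothetical (the text does not give a post-vaccination value), and the names rate1, rate2, pop, and Spread are ours. The procedure uses the original rate through day 7 and the reduced rate afterward, so the new curve starts from the day 7 count.

```maple
# rate1 is the rate from the example; rate2 is a hypothetical reduced rate.
rate1, rate2, pop := 0.0005, 0.0001, 1000:

Spread := proc(n :: nonnegint)
    local rate;
    option remember;
    if n = 0 then return 2.0 end if;            # two students initially infected
    rate := `if`(n <= 7, rate1, rate2);         # vaccine takes effect after day 7
    Spread(n - 1) + rate*Spread(n - 1)*(pop - Spread(n - 1));
end proc:

seq(round(Spread(j)), j in [7, 14, 21, 28]);    # weekly counts
```

The first count should match the day 7 value of the original model; the later counts show how much the smaller rate slows the spread and can be compared with the unvaccinated weekly counts above.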





Exercises 


Iterate and graph the following DDSs. Explain their long-term behavior. For 
each DDS, find a realistic scenario that it might explain/model. 


1. Consider the model a(n + 1) = r · a(n)(1 - a(n)). Let a(0) = 0.2. Determine 
the numerical and graphical solutions for the following values of r. Find 
the pattern in the solution. 

(a) r = 2 
(b) r = 3 
(c) r = 3.6 
(d) r = 3.7 

2. Find the equilibrium value of the given DDS by iteration. Determine if 
the equilibrium value is stable or unstable. 

(a) a(n + 1) = 1.7a(n) - 0.14a(n)^2 
(b) a(n + 1) = 0.8a(n) + 0.1a(n)^2 
(c) a(n + 1) = 0.2a(n) - 0.2a(n)^3 
(d) a(n + 1) = 0.1a(n)^2 + 0.9a(n) - 0.2 


3. A rumor spreads through a company of 1000 employees all working in the 
same building. We assume that the spread of a rumor is similar to the 
spread of a contagious disease in that the number of people hearing the 
rumor each day is proportional to the product of the number hearing the 
rumor and the number who have not yet heard the rumor. This model is 
given by 

r(n + 1) = r(n) + 1000k · r(n) - k · r(n)^2 

where k is the parameter that depends on how fast the rumor spreads. 


Assume k = 0.001. Further assume that 4 people initially know the rumor. 
How soon will everyone know the rumor? 








Projects 


Project 2.1. Consider the highly contagious and deadly Ebola virus, which 
in 2018 appeared to be spreading again. Determine how deadly this virus 
actually is. (Visit https://www.cdc.gov/vhf/ebola/index.html.) Consider an 
animal research laboratory in Reston, VA (pop. ~ 58,000), a suburb of 
Washington, DC (pop. ~ 602,000). A monkey carrying the Ebola virus has escaped 
its cage and infected one employee (unknown at the time) during its escape 
from the research laboratory. This employee reports to University hospital 
later with Ebola symptoms. The Centers for Disease Control and Prevention 
(CDC) in Atlanta gets a call and begins to model the spread of the disease. 
Build a model for the CDC with the following growth rates to determine the 
number infected after 2 weeks: 


(a) k = 0.00025 
(b) k = 0.000025 
(c) k = 0.00005 
(d) k = 0.000009 


List some possible ways of controlling the spread of the virus. 
Project 2.2. A rumor concerning termination spreads among 10,000 employees 
of a major company. Assume that the spreading of a rumor is similar to 
the spread of contagious disease in that the number hearing the rumor each 
day is proportional to the product of those who have heard the rumor and 
those who have not yet heard the rumor. Build a model for the company with 
the following rumor growth rates to determine the number having heard the 
rumor after 1 week: 


(a) k = 0.25 
(b) k = 0.025 
(c) k = 0.0025 


(d) k = 0.00025 


List some possible ways of controlling the spread of the rumor. 
2.7 Systems of Discrete Dynamical Systems 


In this section, we examine models of systems of DDSs. For selected initial 
conditions, we’ll build numerical solutions to get a sense of long-term behavior 
of the system. We’ll find the equilibrium values of the systems we study. We’ll 
then explore starting values near the equilibrium values to see if by starting 
close to an equilibrium value, the system will: 


a. Remain close to the equilibrium value 
b. Approach the equilibrium value 
c. Move away from the equilibrium value 


What happens near the equilibrium values gives great insight into the long- 
term behavior of the system. We can study the resulting numerical solutions 
for patterns. 


Example 2.13. Location Merchants Choose. 

Consider an attempt to revitalize the downtown section of a small city with 
merchants. There are some merchants downtown, and others in the large city 
shopping plaza. Suppose historical records showed that 60% of the downtown 
merchants remain downtown, while 40% move to the shopping plaza. We find 
that 70% of the plaza merchants want to remain in the plaza, but 30% want to 
move downtown. Build a model to determine the long-term behavior of these 
merchants based upon the historical data. See Figure 2.11. There are initially 
100 merchants in the plaza and 150 merchants downtown. We seek to find the 
long-term behavior of this system. 
FIGURE 2.11: Diagram of Merchant Movement (between Downtown Merchants and Plaza Merchants) 

Problem. 
Determine the behavior of the merchants over time to see whether the 
downtown will survive. 


Assumptions and Variables. 
Let n represent the number of months. We define 


d(n) = number of merchants downtown at the end of the nth month 
p(n) = number of merchants at the plaza at the end of the nth month 


We assume that no incentives are given to the merchants for either staying or 
moving. 


The Model. 

The number of merchants downtown in any time period is equal to the 
number of downtown merchants that remain downtown plus the number of 
plaza merchants that relocate downtown. The same is true for the number of 
plaza merchants: the number is equal to the number that remain in the plaza 
plus the number of downtown merchants that move to the plaza. Mathemat- 
ically, we write the model as the system 


d(n + 1) = 0.60d(n) + 0.30p(n) 
p(n + 1) = 0.40d(n) + 0.70p(n) 


with d(0) = 150 and p(0) = 100 merchants, respectively. 
Let’s use Maple to explore this system. 








[> d:= proc(n) 
    option remember; 
    if n < 1 then 150 else 0.60 * d(n - 1) + 0.30 * p(n - 1) end if: 
  end proc: 

> p := proc(n) 
    option remember; 
    if n < 1 then 100 else 0.40 * d(n - 1) + 0.70 * p(n - 1) end if: 
  end proc: 

Numerically, we have: 

> Pts := Matrix([seq([k, d(k), p(k)], k = 0..9)]); 

Pts := 
        0    150            100 
        1    120.0          130.0 
        2    111.0          139.0 
        3    108.300000     141.700000 
        4    107.4900000    142.5100000 
        5    107.2470000    142.7530000 
        6    107.1741000    142.8259000 
        7    107.1522300    142.8477700 
        8    107.1456690    142.8543310 
        9    107.1437007    142.8562993 

And now, graphically: 

> g1 := pointplot(Pts[.., [1, 2]], color = red) : 
  g2 := pointplot(Pts[.., [1, 3]], color = blue) : 
> display(g1, g2, legend = [evaln(d(n)), evaln(p(n))]); 

Analytically, we can solve for the equilibrium values. Let X = d(n) and 
Y = p(n). Then, from the DDS, we have 


X =0.6X + 0.3Y 
Y =0.4X + 0.7Y 





However, both equations reduce to X = 0.75Y. 
There are 2 unknowns, so we need a second equation. From the initial 
conditions, we know that X + Y = 250. Use the equations 


X =0.75Y 
X +Y = 250 


to find the equilibrium values 
X = 107.1428571 and Y = 142.8571429 


Iterating from near these values, we find the sequences (quickly) tend toward 
the equilibrium. We conclude the system has stable equilibrium values. 
Change the initial conditions and see what behavior follows! 
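
The same equilibrium can be obtained directly with solve. In this sketch we write the coefficients as exact rationals, so that the two dependent equations reduce cleanly, and we add the conservation condition X + Y = 250; X and Y are assumed to be unassigned names.

```maple
# Equilibrium of the merchant system with exact rational coefficients.
# X and Y are assumed unassigned; 250 is the total number of merchants.
solve({X = 3/5*X + 3/10*Y, Y = 2/5*X + 7/10*Y, X + Y = 250}, {X, Y});
evalf(%);
```

The exact solution should be X = 750/7 and Y = 1000/7, which evalf converts to approximately 107.14 and 142.86.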


Interpretation. 

The long-term behavior shows that eventually (without other influences) 
of the 250 merchants, about 107 merchants will be downtown and about 
143 will be in the plaza. We might want to try to attract new businesses to the 
community by adding incentives for operating either in the downtown business 
area or in the shopping plaza. 


Competitive hunter models involve species vying for the same resources 
(such as food or living space) in their habitat. The effect of the presence of a 
second species diminishes the growth rate of the first and vice versa.¹⁰ 

Let’s consider a specific example with lake trout and bass in a small lake. 


Example 2.14. Competitive Hunter Model¹¹. 

Hugh Ketum owns a small lake in which he stocks fish with the eventual goal 
of allowing fishing. He decided to stock both bass and lake trout. The Fish 
and Game Warden tells Hugh that after inspecting his lake for environmental 
conditions he has a solid base for growth of his fish. In isolation, bass grow 
at a rate of 20% and trout at a rate of 30% given an abundance of food. 
The warden tells Hugh that the species’ interaction for the food affects the 
trout more than the bass. They estimate the interaction term affecting bass is
0.0010 · bass · trout, and for trout it is 0.0020 · bass · trout. Assume no changes
in the habitat occur.



¹⁰ See “Bass are bad news for lake trout” from the online version of Nature, International
Journal of Science [Lawrence1999].

¹¹ This example comes from a problem developed by Dr. Rich West, Professor Emeritus,
Francis Marion University.
Model.
Define the following:


    B(n) = the number of bass in the lake after period n
    T(n) = the number of lake trout in the lake after period n
    B(n) · T(n) = interaction of the two species


    B(n + 1) = 1.20 B(n) - 0.0010 B(n) · T(n)
    T(n + 1) = 1.30 T(n) - 0.0020 B(n) · T(n)





The equilibrium values can be found by substituting B(n) = x and T(n) =
y, then solving for x and y. We have


    x = 1.2x - 0.001xy
    y = 1.3y - 0.002xy


We rewrite these equations as


    0.2x - 0.001xy = 0   =>   x (0.2 - 0.001y) = 0
    0.3y - 0.002xy = 0   =>   y (0.3 - 0.002x) = 0


The solution is (x = 0 or y = 200) and (y = 0 or x = 150), which gives the
equilibrium values as (0, 0) and (150, 200).
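
As a check on the algebra, the two equilibrium equations can be handed to solve. This is a sketch for a fresh worksheet in which x and y are unassigned names; the coefficients are written as fractions (6/5 for 1.2, 1/1000 for 0.001, and so on) to keep the arithmetic exact, and the name eqs is introduced only for this illustration.

```maple
restart:
# Equilibrium conditions for the bass (x) and trout (y) model.
eqs := {x = 6/5*x - 1/1000*x*y, y = 13/10*y - 1/500*x*y}:
solve(eqs, {x, y});
```

Maple should report the two solutions x = 0, y = 0 and x = 150, y = 200, the same equilibrium points obtained by factoring.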

Our next task is to investigate the long-term behavior of the system and 
the stability of the equilibrium points. 

Hugh initially considered stocking 151 bass and 199 trout in his lake. The 
solution is left to the student as an exercise. (Don’t forget to use option 
remember in your programs for bass and trout.) From Hugh’s initial conditions,
bass will grow without bound and trout will die out (T(29) = 0). This
is certainly not what Hugh had in mind.


Example 2.15. Fast Food Tendencies. 

The Student Union of a university that has 14,000 students plans to have
fast food chains available serving burgers, tacos, and pizza. The chains
commissioned a survey of students, finding the following information concerning
lunch: 75% of those who ate burgers will eat burgers again at the next lunch,
5% will eat tacos next, and 20% will eat pizza next. Of those who ate tacos
last, 20% will eat burgers next, 60% will stay with tacos, and 20% will switch
to pizza. Of those who ate pizza, 40% will eat burgers next, 20% tacos, and
40% will stay with pizza. See Figure 2.12.

FIGURE 2.12: Diagram of Fast Food Movement


Model. 
We formulate the problem as follows. Let n represent the nth day’s lunch, 
and define 


B(n) = the number of burger eaters in the nth lunch period 
T(n) = the number of taco eaters in the nth lunch period 


P(n) = the number of pizza eaters in the nth lunch period 


Using the values in the problem and the diagram leads us to the discrete 
dynamical system 

    B(n + 1) = 0.75 B(n) + 0.20 T(n) + 0.40 P(n)
    T(n + 1) = 0.05 B(n) + 0.60 T(n) + 0.20 P(n)
    P(n + 1) = 0.20 B(n) + 0.20 T(n) + 0.40 P(n)


The same analytic technique that we used in the bass-lake trout example
lets us find any equilibria for our fast food problem. Substitute B = x, T = y,
and P = z in the DDS, then solve. Thus


    x = 0.75x + 0.20y + 0.40z
    y = 0.05x + 0.60y + 0.20z     =>     x = (20/9) z,   y = (7/9) z
    z = 0.20x + 0.20y + 0.40z

Since the university has 14,000 students, x + y + z = 14000. Substitute
the result above into this equation.


    (20/9) z + (7/9) z + z = 14000     =>     z = 3500

The equilibrium value is then (B, T, P) = (7777.8, 2722.2, 3500).
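
The same equilibrium can be obtained in one step by letting Maple solve the linear system. This is a sketch for a fresh worksheet, with the coefficients entered as exact fractions; because the three balance equations are dependent, the third is replaced here by the population condition x + y + z = 14000. The names eqs and sol are illustrative only.

```maple
restart:
# Two balance equations plus the total-population condition.
eqs := {x = 3/4*x + 1/5*y + 2/5*z,
        y = 1/20*x + 3/5*y + 1/5*z,
        x + y + z = 14000}:
sol := solve(eqs, {x, y, z});
evalf(sol);   # decimal form for comparison with the text
```

The exact values are x = 70000/9, y = 24500/9, and z = 3500, which round to the equilibrium (7777.8, 2722.2, 3500).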

The campus has 14,000 students who eat lunch. The graphical results in
Figure 2.13 also show that an equilibrium is reached at about
7778 burger eaters, 2722 taco eaters, and 3500 pizza eaters. This information
allows the fast food establishments to plan for a projected future. By varying
the initial conditions (the initial numbers of who eats where) for the 14,000
students, we find that these are stable equilibrium values. (Do this!)

FIGURE 2.13: Graphical Results of Burgers, Tacos, and Pizza


Note that the equilibrium value 7778 does not indicate that the same people
always eat burgers, etc., but rather that the same number of people choose
burgers, etc.
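
One way to experiment with different starting conditions is sketched below. The procedure name FastFood and its extra arguments b0, t0, and p0 (the initial numbers of burger, taco, and pizza eaters) are assumptions made for this illustration; they are not the book's own program. Passing the initial conditions as arguments lets option remember cache each starting point separately.

```maple
restart:
# FastFood(n, b0, t0, p0) returns [B(n), T(n), P(n)] for the fast food DDS.
FastFood := proc(n::integer, b0, t0, p0)
    local v;
    option remember;
    if n < 1 then
        return [b0, t0, p0];
    end if;
    v := FastFood(n - 1, b0, t0, p0);
    return [0.75*v[1] + 0.20*v[2] + 0.40*v[3],
            0.05*v[1] + 0.60*v[2] + 0.20*v[3],
            0.20*v[1] + 0.20*v[2] + 0.40*v[3]];
end proc:

# Three different splits of the 14,000 students.
FastFood(30, 14000, 0, 0);
FastFood(30, 0, 14000, 0);
FastFood(30, 5000, 5000, 4000);
```

Each call should return values close to (7777.8, 2722.2, 3500), whatever the split of the 14,000 students at the start, which is what a stable equilibrium means here.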








Exercises 


Iterate and graph the following DDSs. Explain their long-term behavior. For 
each DDS, find a realistic scenario that it might explain/model. 


1. What happens to the merchant problem if 200 were initially in the shop- 
ping plaza and 50 were in the downtown portion? 
2. Determine the equilibrium values of the bass and lake trout. Can these
levels ever be achieved and maintained? Explain.


3. Test the fast food models with different starting conditions summing to 
14,000 students. What happens? Obtain a graphical output and analyze 
the graph in terms of long-term behavior. 




Projects 


Project 2.1. Small Birds and Osprey Hawks. 
Problem. 
Predict the number of small birds and osprey hawks in the same environment 
as a function of time. Osprey hawks will eat small birds when fish aren’t 
readily available along the coast. The coefficients given below are assumed to 
be accurate. 
Variables.

    B(n) = number of small birds at the end of period n
    H(n) = number of hawks at the end of period n

Model.

    B(n + 1) = 1.2 B(n) - 0.001 B(n) H(n)
    H(n + 1) = 1.3 H(n) - 0.002 H(n) B(n)





(a) Find the equilibrium values of the system. 


(b) Iterate the system from the initial conditions given in Table 2.5 and 
determine what happens to the hawks and small birds in the long term. 


TABLE 2.5: Initial Conditions for Small Birds and Hawks

    Small Birds    Hawks
    -----------    -----
        150         200
        151         199
        149         210
         10          10
        100         100


Project 2.2. Winning at Racket-Ball. 
Rickey and Grant play racket-ball very often and are very competitive. Their 
racket-ball match consists of two games. When Rickey wins the first game, 
he wins the second game 75% of the time. When Grant wins the first game,
he wins the second only 55% of the time. Diagram the ‘movement.’ Model 
this situation as a DDS and determine the long-term percentages of their 
racket-ball game wins. What assumptions are necessary? 


Project 2.3. Voter Distribution. 

It is getting close to election day. The influence of the new Independent Party 
is of concern to both the Republicans and Democrats. Assume that in the next 
election that 79% of those who vote Republican vote Republican again, 1% 
vote Democratic, and 20% vote Independent. Of those that voted Democratic 
before, 5% vote Republican, 70% vote Democratic again, and 20% vote Inde- 
pendent. Of those who previously voted Independent, 35% will vote Republi- 
can, 20% will vote Democratic, and 45% will vote Independent again. 


(a) Formulate the discrete dynamical system that models this situation. 


(b) Assume that there are 399,998 voters initially in the system. How many 
will vote Republican, Democratic, and Independent in the long run? 
(Hint: you can break down the 399,998 voters in any manner that you 
desire as initial conditions.) 


(c) NEW SCENARIO. In addition to the above, the community is growing:
18-year-olds + new people moving in - deaths - current people moving out.


Republicans predict a gain of 2,000 voters between elections. Democrats 
also estimate a gain of 2,000 voters between elections. The Independents 
estimate a gain of 1,000 voters between elections. If this rate of growth 
continues, what will be the long-term distribution of the voters? 








2.8 Case Studies: Predator-Prey Model, SIR Model,
and Military Models


A Predator-Prey Model: Foxes and Rabbits 


In the study of the dynamics of a single population, we typically take into 
consideration such factors as the “natural” growth rate and the “carrying 
capacity” of the environment. Mathematical ecology often studies populations 
that interact, thereby affecting each other’s growth rates. In this Case Study, 
we investigate a very special case of interaction, in which there are exactly 
two species: a predator which eats a prey. Such pairs exist throughout nature: 


• lions and gazelles,

• birds and insects,

• pandas and bamboo,

• Venus Flytraps and insects.


Let x(n) be the size of the prey population and y(n) be the size of the predator 
population at time period n. 

To keep our model simple, we will make some assumptions that would be 
unrealistic in most predator-prey situations. Specifically, we will assume that: 


1. the predator species is totally dependent on a single prey species as its 
only food supply, (e.g., koalas and eucalyptus trees) 


2. the prey species has an unlimited food supply, and 
3. there are no other threats to the prey, just the specific predator. 


We will use the Lotka-Volterra model.¹² If there were no predators, the
second assumption would imply that the prey species grows exponentially
without bound; that is, x(n + 1) = x(n) + a · x(n).

But there are predators, which must account for a negative component in 
the prey growth rate. The crucial additional assumptions about predator-prey 
interactions for developing the model are: 


1. The rate at which predators encounter prey is jointly proportional to the 
sizes of the two populations. 


2. A fixed proportion of encounters between predator and prey lead to the 
death of the prey. 


These assumptions lead to the conclusion that the negative component of the 
prey growth rate is proportional to the product xy of the population sizes. 
Putting these factors together yields 


    x(n + 1) = x(n) + a x(n) - b x(n) · y(n)


Now we consider the predator population. If there were no food supply,
the predators would die out at a rate proportional to their population size;
i.e., we would find y(n + 1) = y(n) - c y(n).

We assume that the “natural growth rate” is a composite of birth and
death rates, both presumably proportional to the current population size.
In the absence of food, there is no energy supply to support the birth rate.
But there is a food supply: the prey. And what’s bad for rabbits is good for
foxes. That is, the energy to support growth of the predator population is
proportional to deaths of prey, so


    y(n + 1) = y(n) - c y(n) + p x(n) · y(n)





¹² Lotka and Volterra independently developed this model. See Lotka’s Elements of Physical
Biology, Williams & Wilkins Co., 1925, and Volterra’s “Variazioni e fluttuazioni del
numero d’individui in specie animali conviventi,” Mem. R. Accad. Naz. dei Lincei, Ser. VI,
vol. 2, 1926.
Put all the above together to have the discrete version of the Lotka-Volterra
Predator-Prey Model

    x(n + 1) = (1 + a) x(n) - b x(n) y(n)
                                              for n = 0, 1, 2, ...,     (2.1)
    y(n + 1) = (1 - c) y(n) + p x(n) y(n)

with (x(0), y(0)) = (x0, y0) and where a, b, c, and p are positive constants.
The continuous Lotka-Volterra model, analogous to (2.1), consists of the
system of linked differential equations


    x'(t) = +α x(t) - β x(t) y(t)
                                      for t > 0,
    y'(t) = -γ y(t) + ρ x(t) y(t)


that cannot be separated from each other and that cannot be solved in closed
form. Nevertheless, the continuous model, just like the discrete, can be solved
numerically and graphed in order to obtain insights about the system being
studied.
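
A numerical solution of the continuous model can be sketched with dsolve and odeplot. The rate constants and initial populations below are assumptions borrowed from the discrete example purely for illustration (continuous rates need not equal the discrete ones), and the names al, be, ga, and rh stand in for the four constants because gamma is a protected name in Maple.

```maple
restart:
with(plots):
# Hypothetical rate constants and initial populations (illustration only).
al, be, ga, rh := 0.039, 0.0003, 0.12, 0.0001:
sys := {diff(x(t), t) =  al*x(t) - be*x(t)*y(t),
        diff(y(t), t) = -ga*y(t) + rh*x(t)*y(t),
        x(0) = 900, y(0) = 110}:
sol := dsolve(sys, {x(t), y(t)}, numeric):
# Both populations against time, then predators against prey.
odeplot(sol, [[t, x(t)], [t, y(t)]], 0 .. 200);
odeplot(sol, [x(t), y(t)], 0 .. 200, labels = ["Prey", "Predators"]);
```

The second plot is the continuous counterpart of a foxes-versus-rabbits plot and can be compared with the discrete orbit.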

Let’s model an isolated population of foxes and hares on a small island
with a discrete Lotka-Volterra system. Data collected in the field has yielded
the estimates


    {a, b, c, p} = {0.039, 0.0003, 0.12, 0.0001}


for the parameters of (2.1). Use Maple to investigate the model’s behavior.

In order to simplify the Maple code, we will use global variables, variables 
defined outside any program but available inside any program. Remember to 
load plots and to use option remember to drastically reduce the amount of 
computation needed. 


> with(plots):
> PredPrey := proc(n :: integer)
    local u, v, rn, fn;
    global r0, f0, a, b, c, p;
    option remember;
    if n < 1 then
      u := [r0, f0];   # Initial [rabbits, foxes]
    else
      v := PredPrey(n - 1);
      rn := v[1];   # current rabbits
      fn := v[2];   # current foxes
      u := [(1 + a)*rn - b*rn*fn, (1 - c)*fn + p*rn*fn];
    end if;
    return(u);
  end proc:
> r0, f0 := 900, 110:
  a, b, c, p := 0.039, 0.0003, 0.12, 0.0001:
> RF := Matrix([seq(PredPrey(i), i = 0..200)]);

              900            110
              905.4000       106.7000
              911.7287460    103.5566180
              918.9615035    100.5713784
              927.0746346     97.74493550
    RF :=     936.0454902     95.07722828
              945.8522811     92.56762196
              956.4739312     90.21503696
              967.8899152     88.01806562
              980.0800826     85.97507756
                       ...
                  201 x 2 Matrix

(Double click the output (blue) matrix to open the Matrix Browser to see all
the entries.)
> Rabbits := [seq([i, RF[i, 1]], i = 1..201)]:
  pointplot(Rabbits, title = "Rabbits");

[Plot: Rabbits]

> Foxes := [seq([i, RF[i, 2]], i = 1..201)]:
  pointplot(Foxes, title = "Foxes");

[Plot: Foxes]

> pointplot(RF, labels = ["Rabbits", "Foxes"], title = "Predator-Prey");
Running this model for even more iterations would show the plot of foxes
versus rabbits continuing to spiral. We conclude that the model appears
reasonable. We can also find the equilibrium values for the system: there are two
equilibrium points for (rabbits, foxes), namely (0, 0) and (1200, 130). The
orbits of the spiral indicate that the system is moving away from (1200, 130).
We conclude that this system, for this set of parameter values, is
not stable. Further explorations will appear in the exercise set.
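
The two equilibrium points quoted above can be recovered with solve. The sketch below is intended for a fresh worksheet (it begins with restart so that a, b, c, p, x, and y are unassigned); the names eqs and params are illustrative, and the parameter values are the field estimates written as exact fractions.

```maple
restart:
# Equilibrium conditions for the discrete Lotka-Volterra model (2.1).
eqs := {x = (1 + a)*x - b*x*y, y = (1 - c)*y + p*x*y}:
solve(eqs, {x, y});                 # symbolic equilibria
params := {a = 39/1000, b = 3/10000, c = 3/25, p = 1/10000}:
solve(eval(eqs, params), {x, y});   # equilibria for the island data
```

Symbolically, the nonzero equilibrium is x = c/p and y = a/b, which gives (1200, 130) for these parameter values.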


A Discrete SIR Model of an Epidemic 


A new flu variant is spreading throughout the United States. The Centers for
Disease Control and Prevention (CDC) is interested in knowing and experimenting
with a model for this new disease prior to it actually becoming a real
epidemic.¹³ For our model, the population is divided into three categories:
susceptible, infected, and removed. We make the following assumptions.


• The community is closed; no one enters or leaves and there is no outside
contact.

• Each person is either susceptible to the new flu; infected and able to spread
the flu; or already has had the flu and now has immunity (this includes those
who have died from the disease).

• Initially, every person is either susceptible or infected; there is no initial
immunity.

• Once someone contracts this flu and recovers, they have immunity and
cannot be reinfected.

• The average course of the disease is 2 weeks; during this time, the person
is infected and contagious.


The time period for our model will be one week. Define the variables


    n    = the current week
    S(n) = the number that are susceptible at week n
    I(n) = the number that are infected and contagious at week n
    R(n) = the number that have recovered (or died) and are immune at week n


Begin by examining R(n). The length of time someone has the flu is 2 
weeks. Thus, half the infected people will be removed each week. So 


R(n + 1) = R(n) + 0.5 I(n). 


The value 0.5 is called the removal rate per week. This value represents the 
proportion of the infected persons who are ‘removed’ from infection each week. 
If real data were available, we would analyze the data to estimate the removal 
rate parameter. 

Now examine I(n). The number infected will both increase by new infections
and decrease by removals over time. As the disease lasts for 2 weeks, half
the number infected are removed each week, 0.5 I(n). The increase is
proportional to the number of susceptibles that come into contact with an infected
person and subsequently catch the disease, a S(n) · I(n). The parameter a,
the likelihood that contact leads to infection, is called the transmission
coefficient. We realize this is a probabilistic coefficient. We will assume that
this rate is a constant that can be estimated from initial conditions.



¹³ See https://www.cdc.gov/flu/weekly/flusight/.

For illustration, assume the population is 1,000 students in dorms. The 
Health Center reported that 3 students were infected initially. The next week, 
5 students came to the Health Center with flu-like symptoms. We have I(0) = 
3 and S(0) = 997 and 5 new infections. Then 


    5 = a S(0) I(0) = a · 997 · 3   =>   a = 0.00167

The normal course of the disease is 2 weeks, so on average, we expect 0.5 I(n)
to recover each week, leaving 0.5 I(n) remaining infected. We now have

    I(n + 1) = 0.5 I(n) + 0.00167 S(n) I(n)
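
The estimate of the transmission coefficient is simple arithmetic, but it is worth seeing what the rounding does. In this small sketch the name aEst is an assumption introduced only for illustration.

```maple
# Transmission coefficient from 5 new infections with S(0) = 997, I(0) = 3.
aEst := evalf(5/(997*3));
# First-week infected count with the unrounded and the rounded coefficient.
0.5*3 + aEst*997*3;
0.5*3 + 0.00167*997*3;
```

With the unrounded coefficient the first step reproduces the 5 new infections, so I(1) is 6.5 up to floating-point rounding; with the rounded value 0.00167 used in the text it is 6.49497.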
Last, examine S(n). The number of susceptibles is decreased only by the
number that become infected. We use the same transmission coefficient as
before to obtain the model

    S(n + 1) = S(n) - a S(n) I(n)

Our coupled SIR model is


    S(n + 1) = S(n) - 0.00167 S(n) I(n)
    I(n + 1) = 0.5 I(n) + 0.00167 S(n) I(n)          (2.2)
    R(n + 1) = R(n) + 0.5 I(n)


with (S(0), I(0), R(0)) = (997, 3, 0).

The Discrete SIR Model (2.2) can be solved iteratively and viewed graphically.
Do this with Maple to observe the behavior and gain insights. (Use lowercase
letters for the variables since in Maple I is predefined as I = √(-1).)


> DSIR := proc(n :: integer)
    local u, v, s, i, r;
    global s0, i0, r0;
    option remember;
    if n < 1 then
      u := [s0, i0, r0];   # Initial [susceptible, infected, removed]
    else
      v := DSIR(n - 1);
      s := v[1];   # current susceptible
      i := v[2];   # current infected
      r := v[3];   # current removed
      u := [s - 0.00167*s*i, 0.5*i + 0.00167*s*i, r + 0.5*i];
    end if;
    return(u);
  end proc:
> s0, i0, r0 := 997, 3, 0:
> Flu := Matrix([seq(DSIR(k), k = 0..25)]);

               997            3              0
               992.00503      6.49497        1.5
               981.2451483    14.00736666    4.747485
               958.2915651    29.95726650    11.75116833
               910.3495461    62.92065225    26.72980158
    Flu :=     814.6923014    127.1175708    58.19012770
               641.7442519    236.5068349    121.7489131
               388.2768258    371.7208435    240.0023305
               147.2447418    426.8925058    425.8627523
               42.2724216     318.4185731    639.3090052
                                ...
                           26 x 3 Matrix

(Double click the output matrix (blue on your screen) to open the Matrix
Browser to see all the entries.)
Now for the graphs.

> Susceptible := [seq([k, Flu[k, 1]], k = 1..26)]:
  SP := pointplot(Susceptible, title = "Susceptible", color = red);
> Infected := [seq([k, Flu[k, 2]], k = 1..26)]:
  IP := pointplot(Infected, title = "Infected", color = blue);
> Removed := [seq([k, Flu[k, 3]], k = 1..26)]:
  RP := pointplot(Removed, title = "Removed", color = gold);
> display(SP, IP, RP, legend = ["Susceptible", "Infected", "Removed"]);
The number of infections in the flu epidemic peaks around week 8, at the
maximum of the Infected graph. The maximum number is slightly larger than
400; from the table it is approximately 427. After 25 weeks, slightly more
than 9 students have never had the flu. You will be asked to check the model for
sensitivity to the parameter values in the exercise set.
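
The peak can also be located without reading the graph. The sketch below is self-contained and meant for a fresh worksheet: it restates the model in a procedure with the hypothetical name FluStep (initial values built in), collects the infected counts, and finds the largest one together with its week. The names infected, peak, pos, and peakWeek are likewise illustrative.

```maple
restart:
# FluStep(n) returns [S(n), I(n), R(n)] for model (2.2).
FluStep := proc(n::integer)
    local v, s, i, r;
    option remember;
    if n < 1 then
        return [997, 3, 0];
    end if;
    v := FluStep(n - 1);
    s, i, r := v[1], v[2], v[3];
    return [s - 0.00167*s*i, 0.5*i + 0.00167*s*i, r + 0.5*i];
end proc:

infected := [seq(FluStep(k)[2], k = 0 .. 25)]:
peak := max(op(infected));         # largest number infected in any week
member(peak, infected, 'pos'):     # pos = position of the peak in the list
peakWeek := pos - 1;               # the list starts at week 0
```

Because the list starts at week 0, the position is shifted by one to obtain the week. The result should agree with the table: a peak of about 427 infected in week 8.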

Use the spacecurve function from the plots package to see the DDS trajec- 
tory for the flu. 
> pp := pointplot3d(Flu, symbolsize = 16, color = red):
  sp := spacecurve(Flu):
  display(pp, sp, axes = frame, orientation = [-20, 70],
          tickmarks = [8, 5, 5]);

[Plot: three-dimensional trajectory of the flu DDS]


What can you learn about the flu’s progression from the graph of the trajec- 
tory? Notice the spacing of the data points—remember, the data points are 
taken at equal time intervals. 


Modeling Military Insurgencies 


Insurgent forces have a strong foothold in the city of Urbania, a major
metropolis in the center of the country of Freedonia. Intelligence estimates
they currently have a force of about 1,000 fighters. The local police force has
approximately 1,300 officers, many of whom have had no formal training in law
enforcement methods or in modern tactics for addressing insurgent activity.
Based on data collected over the past year, approximately 8% of insurgents
switch sides and join the police each week, whereas about 11% of police switch
sides and join the insurgents. Intelligence also estimates that around 120 new
insurgents arrive from the neighboring country of Sylvania each week. Recruiting
efforts in Freedonia yield about 85 new police recruits each week as well.
In armed conflict with insurgent forces, the local police are able to capture
or kill approximately 10% of the insurgent force each week on average while
losing about 3% of their force. See Figure 2.14.

FIGURE 2.14: Diagram of Police versus Insurgents


Problem. 
Build a mathematical model of this insurgency. Determine the equilibrium 
state, if it exists, for the DDS. 


Variables. 
n = current time period 


p(n) = number of police in the force at the end of time period n 
r(n) = number of insurgents/rebels at the end of time period n 


for n = 0, 1, 2, ... weeks. 


Model. 
    p(n + 1) = p(n) - 0.03 p(n) - 0.11 p(n) + 0.08 r(n) + 85
             = 0.86 p(n) + 0.08 r(n) + 85
    r(n + 1) = r(n) + 0.11 p(n) - 0.08 r(n) - 0.10 r(n) + 120
             = 0.11 p(n) + 0.82 r(n) + 120

with p(0) = 1300 and r(0) = 1000 and n = 0, 1, 2, ....
Use Maple to investigate the Police-Insurgents DDS.


> Insurgency := proc(n :: integer)
    local u, v, p, r;
    global p0, r0;
    option remember;
    if n < 1 then
      u := [p0, r0];   # Initial [police, insurgents]
    else
      v := Insurgency(n - 1);
      p := v[1];   # current police
      r := v[2];   # current insurgents
      u := [0.86*p + 0.08*r + 85, 0.11*p + 0.82*r + 120];
    end if;
    return(u);
  end proc:
> p0, r0 := 1300, 1000:


> PR := Matrix([seq(Insurgency(k), k = 0..52)]);

               1300           1000
               1283.0         1083.0
               1275.0200      1149.1900
               1273.452400    1202.588000
               1276.376104    1246.201924
    PR :=      1282.379603    1282.286949
               1290.429415    1312.537054
               1299.772261    1338.227620
               1309.862354    1360.321597
               1320.307352    1379.548569
                        ...
                   53 x 2 Matrix

(Double click the output matrix (blue on your screen) to open the Matrix
Browser to see all the entries.)

Now for the graphs.

> police := [seq([k, PR[k, 1]], k = 1..53)];
  insurgents := [seq([k, PR[k, 2]], k = 1..53)];
> pol := pointplot(police, symbol = circle);
  ins := pointplot(insurgents, symbol = solidbox);
> display(pol, ins, labels = ["Week", "Number"],
          title = "Insurgency Model", legend = ["Police", "Insurgents"]);
In this insurgency model, we see that under the current conditions, the
insurgency overtakes the government after 5 weeks. This is unacceptable to
the government, so we must modify conditions that affect the parameters in
such a way as to obtain a police victory. You will be asked to experiment with
the DDS’s parameters in the exercises.
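
The week in which the insurgents first outnumber the police can be found with a short loop instead of by inspecting the table. This sketch is self-contained and meant for a fresh worksheet; the procedure name PoliceRebels and the counter k are assumptions made for the illustration, with the initial forces written directly into the procedure.

```maple
restart:
# PoliceRebels(n) returns [p(n), r(n)] for the Police-Insurgents DDS.
PoliceRebels := proc(n::integer)
    local v;
    option remember;
    if n < 1 then
        return [1300, 1000];
    end if;
    v := PoliceRebels(n - 1);
    return [0.86*v[1] + 0.08*v[2] + 85, 0.11*v[1] + 0.82*v[2] + 120];
end proc:

# Advance week by week until insurgents exceed police (or week 52 is reached).
k := 0:
while PoliceRebels(k)[2] <= PoliceRebels(k)[1] and k < 52 do
    k := k + 1;
end do:
k;
```

Judging by the tabulated values, the loop should stop at k = 6, the first week in which insurgents outnumber police, which is consistent with the insurgency overtaking the government after 5 weeks. Rerunning the loop after changing the coefficients is a quick way to test candidate policies.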




Exercises 


1. Find the equilibrium values for the Predator-Prey model presented. 


2. Determine the outcomes in the Predator-Prey model with the following
sets of parameters.

(a) There are 200 foxes and 400 rabbits initially.
(b) There are 2,000 foxes and 10,00 rabbits initially.
(c) The birth rate of rabbits increases to 0.1.


3. Find the equilibrium values for the SIR model presented.


4. In the SIR model, determine the outcome with the following parameters
changed.

(a) Initially, 5 are sick, and 10 are sick the next week.
(b) The flu only lasts 1 week.
(c) The flu lasts 4 weeks.
(d) There are 4,000 students in the dorm. Initially, 5 are infected, and 30
more are sick the next week.


5. Find the equilibrium values analytically, if any, for the Police-Insurgents 
DDS. 


6. In the Insurgency model, determine what values of the parameters police 
recruiting, police operation losses, and police conversions to insurgents 
enable a reversal of the outcome of insurgents’ numbers overtaking police 
numbers as predicted under the current conditions. 


7. Investigate the Maple graph generated by spacecurve for the Police- 
Insurgents DDS. 
Problem Solving with Single-Variable
Optimization


Objectives:

(1) Set up a solution process for single- and multivariable optimization
problems.

(2) Recognize when applicable constraints are required.

(3) Perform sensitivity analysis, when applicable, and interpret the
results.

(4) Know the numerical techniques and when to use them to obtain
approximate solutions.





3.1 Single-Variable Unconstrained Optimization 


Rigs-R-Us has an oil-drilling rig 9.5 miles offshore. The drilling rig is to be 
connected to a pumping station via an underwater pipe. A land-based pipe 
will connect the proposed pumping station to a refinery 15.7 miles down the 
shoreline from the drilling rig. See Figure 3.1. 


[Diagram labels: Offshore Oil Rig, 9.5 miles from the Shoreline; Pumping Station and pipeline along the Shoreline; 15.7 miles]

FIGURE 3.1: Oil Pipe Route and Proposed Pumping Station
Underwater pipe costs $32,575 per mile to install; land-based pipe costs
$14,442 per mile. Where should Rigs-R-Us place the pumping station along
the shoreline to minimize the total cost of the pipe?
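
To see how such a problem turns into a single-variable optimization, here is a hedged sketch of one possible formulation; it is not the book's worked solution. It assumes that x is the distance in miles along the shoreline from the point nearest the rig to the pumping station, that the refinery lies 15.7 miles along the shoreline from that same nearest point, and that both pipes run in straight lines. The names Cost and xs are introduced here for illustration.

```maple
restart:
# Total cost: underwater leg (hypotenuse) plus land leg (remaining shoreline).
Cost := x -> 32575*sqrt(9.5^2 + x^2) + 14442*(15.7 - x):
xs := fsolve(D(Cost)(x) = 0, x = 0 .. 15.7);   # interior critical point
Cost(xs);                                      # cost at the critical point
Cost(0), Cost(15.7);                           # endpoint costs for comparison
```

Comparing the cost at the critical point with the costs at the two endpoints follows the endpoint reasoning reviewed in this section.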

In this section, we will discuss problems like the pumping station location
that we can model and solve using single-variable calculus. First, we’ll review
the calculus concepts needed for optimization, and then apply them to
applications. Maple will handle the computations. We want to solve problems of
the form: optimize f(x) (maximize or minimize) for x in an interval with
endpoints a and b.

If a = -∞ and b = ∞, then we are looking at (x, f(x)) ∈ R², the entire
xy-plane. If either a or b, or both, are restricted, we must consider possible
end points in our solution. Define points x_e where f'(x_e) = 0 as critical points
or stationary points. The three cases of extended critical points x_e are:


Case 1. Points x_e where a < x_e < b and f'(x_e) = 0.
Case 2. Points x_e where a < x_e < b and f'(x_e) does not exist.
Case 3. End points x_e = a and x_e = b of the interval [a, b].


A local extremum need only be an extreme value over some interval, while a
global extreme value must be an extreme value over the entire domain of f.
A summary of the relevant definitions and theorems from calculus follows.


Definition. Local and Global Extrema.
A function f has a local maximum (local minimum) at a point c iff f(x) ≤ f(c)
(respectively, f(x) ≥ f(c)) for all x in an interval containing c.

A function f has a global maximum (global minimum) at a point c iff
f(x) ≤ f(c) (respectively, f(x) ≥ f(c)) for all x in the domain of f.


Theorem. Extreme Value Theorem. 
Let f be continuous on a closed interval [a,b]. Then f must have both a global 
maximum and a global minimum over the interval. 


Remember, extreme values can occur at an interval’s endpoints, and there 
may be more than one maximum or minimum for any given function. If there 
are several local maxima, they may have different functional values, and sim- 
ilarly for minima; however, global maxima must all have the same functional 
value. Consider the extreme values of f(x) = sin(x)/x and g(x) = cos(x) for 
examples. 

If f is differentiable, we can do more. Recall your analysis of the shape 
of f’s graph leading to the first and second derivative tests in calculus. As is 
often customary in calculus classes, we’ll state the Second Derivative Test first, 
and give the First Derivative Test second, with an extension of the Second 
Derivative Test in between. 


Theorem. Second Derivative Test.
Let f'(x_e) = 0 for some x_e ∈ (a, b). Then


• if f''(x_e) > 0 (concave up), f has a local minimum at x_e,
• if f''(x_e) < 0 (concave down), f has a local maximum at x_e,
• if f''(x_e) = 0, f has a possible inflection point (change of concavity) at x_e.


Theorem. Extended Derivative Test.
Let f'(x_e) = f''(x_e) = 0 for some x_e ∈ (a, b). Then


• if the first non-zero derivative is even-order (i.e., 4th, 6th, 8th, etc.) and

    - is positive at x_e, then f has a local minimum at x_e,

    - is negative at x_e, then f has a local maximum at x_e,


• if the first non-zero derivative is odd-order (i.e., 3rd, 5th, 7th, etc.), then
f does not have an extremum at x_e.
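
A quick illustration of the Extended Derivative Test, using the function f(x) = x^4 chosen here as an example (it does not appear in the text): the Second Derivative Test is inconclusive at x = 0, so we list successive derivatives there until one is nonzero.

```maple
restart:
f := x -> x^4:
# First through fourth derivatives of f evaluated at x = 0.
seq((D@@k)(f)(0), k = 1 .. 4);
```

The sequence is 0, 0, 0, 24. The first nonzero derivative is the 4th (even order) and it is positive, so f has a local minimum at x = 0.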


Theorem. First Derivative Test.
Let f'(x_e) = 0 for some x_e ∈ (a, b). Then


• if f'(x) > 0 (increasing) below x_e and f'(x) < 0 (decreasing) above x_e,
then f has a local maximum at x_e,


• if f'(x) < 0 (decreasing) below x_e and f'(x) > 0 (increasing) above x_e,
then f has a local minimum at x_e,


• if f'(x) does not change sign around x_e, then f(x_e) is neither a local
maximum nor a local minimum.


Theorem. Nondifferentiable Point Test.
Suppose f'(x_e) does not exist. In this case, test points near x_e: evaluate f at
‘nearby’ x1 and x2 where x1 < x_e < x2. See Table 3.1.


TABLE 3.1: Testing a Nondifferentiable Point

    Relation                                   Classification
    f(x1) < f(x_e) and f(x_e) < f(x2)          x_e is not a local extremum
    f(x1) > f(x_e) and f(x_e) < f(x2)          x_e is a local minimum
    f(x1) < f(x_e) and f(x_e) > f(x2)          x_e is a local maximum
    f(x1) > f(x_e) and f(x_e) > f(x2)          x_e is not a local extremum


Theorem. Interval Endpoints Test.
Suppose that f is differentiable on (a, b) and has right and left derivatives at
x = a and x = b, respectively. Then, for x = a,


• if f'(a) > 0, f(a) is a local minimum.

• if f'(a) < 0, f(a) is a local maximum.

For x = b, the opposite holds,

• if f'(b) > 0, f(b) is a local maximum.
• if f'(b) < 0, f(b) is a local minimum.


If f'(a) = 0 = f'(b), then use nearby test points and graphs to determine f’s
behavior.


Example 3.1. A Set of Simple Examples. 


1. Minimize f(x) = 1.5x² + 1 over R.
Solution: Since f'(x) = 3x is 0 at x = 0 and f''(0) = +3, we see that
x = 0 is a minimum. Since f is always concave up, x = 0 must be a global
minimum.


2. Maximize g(x) = x³ + 1 over the interval [-1, 3].
Solution: First, g'(x) = 3x² is 0 at x = 0. Calculating derivatives, g''(0) =
0 and g'''(0) = 6, gives the 3rd derivative as the first nonzero derivative of g;
therefore, g has neither a maximum nor a minimum at x = 0.


Since the interval is closed, we must check the endpoints x = -1 and 3.
At the left endpoint, g'(-1) = +3; therefore, g is increasing, which makes
x = -1 a minimum. At the right endpoint, g'(3) = +27; therefore, g is
increasing, which makes x = 3 a maximum.


3. Optimize h(x) = |x| over R.
Solution: h'(x) = +1 if x > 0 and -1 if x < 0, and h has no derivative at
x = 0. Testing h at x = -0.1 and +0.1 gives


    h(-0.1) = 0.1 > h(0) = 0   and   h(0) = 0 < h(+0.1) = 0.1


which indicates x = 0 is a minimum. As h always decreases below x = 0
and always increases above x = 0, we have that x = 0 is a global minimum.


For more examples, take any calculus text and work through the optimization 
exercises in the “Applications of Derivatives” chapter. 
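
The first two parts of Example 3.1 can be reproduced with D and solve. This is a sketch for a fresh worksheet; the function names f and g follow the example, and nothing else is assumed beyond the standard D and solve commands.

```maple
restart:
# Part 1: f(x) = 1.5 x^2 + 1 over the real line.
f := x -> 1.5*x^2 + 1:
solve(D(f)(x) = 0, x);          # critical point
(D@@2)(f)(0);                   # sign of the second derivative there

# Part 2: g(x) = x^3 + 1 on the interval [-1, 3].
g := x -> x^3 + 1:
solve(D(g)(x) = 0, x);          # critical point (a double root)
seq((D@@k)(g)(0), k = 2 .. 3);  # second and third derivatives at 0
D(g)(-1), D(g)(3);              # slopes at the two endpoints
g(-1), g(3);                    # endpoint values
```

For g, the second derivative vanishes at 0 while the third does not, so the Extended Derivative Test rules out an extremum there and the endpoints decide the answer.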

We’ll use familiar Maple commands for differentiation and solving equa- 
tions. Recall the commands’ usage from their Maple Help Pages. 





D - differential operator 


Calling Sequence 
D(f) 
Parameters 
f — expression which can be applied as a function 
diff or Diff - differentiation or partial differentiation


Calling Sequence
diff(f, x1, ..., xj)
diff(f, [x1$n])
Remark: these calling sequences are also valid with the inert Diff
command
Parameters 
f — algebraic expression or an equation 
x1, x2, ..., xj — names representing differentiation variables 
n - algebraic expression entering constructions like x$n, 


representing nth order derivative, assumed to be 
integer order differentiation 








solve — solve one or more equations 


Calling Sequence 
solve( equations, variables) 


Parameters 
equations — equation or inequality, or set or list of equations or 
inequalities 
variables — (optional) name or set or list of names; unknown(s) 
for which to solve 








fsolve — solve one or more equations using floating-point arithmetic 


Calling Sequence 
fsolve( equations, variables, complex) 


Parameters 
equations — equation, set(equation), expression, set(expression), 
list (equation) 
variables — (optional) name or set or list of names; unknown(s) for 
which to solve 
complex - (optional) literal; search for complex solutions 











Note: For simple functions f(x), Maple can use f'(x), f''(x), etc., for
derivatives.
Basic Applications Calculus Max-Min Theory


Example 3.2. Chemical Manufacturing Company. 
A chemical manufacturing company sells sulfuric acid at a price of $200 per 
ton. The daily total production cost in dollars for x tons is 


\[ C(x) = 150000 + 145.5x + 0.0025x^2, \]


and the daily production is at most 7,500 tons. How many tons of sulfuric 
acid should the manufacturer produce to maximize daily profits? 


Solution. The profit function is 


\[
\begin{aligned}
\text{Profit} &= \text{Revenue} - \text{Cost} \\
              &= 200x - (150000 + 145.5x + 0.0025x^2) \\
              &= -150000 + 54.5x - 0.0025x^2 \quad \text{for } 0 \le x \le 7500.
\end{aligned}
\]


Profit is twice differentiable, so use Maple for a Second Derivative Test. We'll
apply Maple's D operator to the Profit function to compute the derivative
functions.

> Profit := x -> -150000 + 54.5*x - 0.0025*x^2;
  dProfit := D(Profit);
  d2Profit := D(dProfit);
        Profit := x ↦ -150000 + 54.5 x - 0.0025 x^2
        dProfit := x ↦ 54.5 - 0.0050 x
        d2Profit := x ↦ -0.0050

> c := solve(dProfit(x) = 0, x);
        c := 10900.

> Profit(c);
        147025.0000

We must check the endpoints since Profit is defined on the interval [0,7500].

> [0, Profit(0)], [7500, Profit(7500)];
        [0, -150000.], [7500, 118125.0000]


Even though Profit(c) is the largest, c = 10,900 tons is greater than the 
maximum production of 7,500 tons. So the company should produce all it 
can, 7,500 tons, for a maximum Profit(7500) = $118,125.00. 

Verify the analysis showing the end point solution is selected because the 
critical point is outside of Profit’s domain by producing graphs of Profit over 
[0,7500] and over [0, 12000]. 
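
One way to produce the two suggested graphs is sketched below. It assumes Profit is defined as in the Maple session for this example; the plot titles are illustrative choices, not part of the original worksheet.

```maple
# Profit function from Example 3.2
Profit := x -> -150000 + 54.5*x - 0.0025*x^2:

# Over the feasible domain [0, 7500] the curve rises throughout,
# so the maximum sits at the right endpoint.
plot(Profit, 0 .. 7500, thickness = 2, title = "Profit on [0, 7500]");

# Over [0, 12000] the vertex at x = 10900 becomes visible,
# to the right of the 7500-ton production limit.
plot(Profit, 0 .. 12000, thickness = 2, title = "Profit on [0, 12000]");
```

Comparing the two pictures shows why the endpoint is selected: the unconstrained peak lies outside the interval on which Profit is defined.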





Example 3.3. Computer Production. 

Bell Computers spends $250 in variable costs to produce an SP6 computer.
They have a fixed cost of $5500 whenever the plant is in operation and computers
are produced. If Bell spends x dollars on advertising their new SP6, they
can sell √x SP6's at $550 per computer. How many SP6 computers should
the company produce to maximize profits?


Solution. Since revenue is price times quantity and total cost is fixed cost
plus variable costs times quantity, Bell's profit function is given by

> varcost, fixedcost, price := 250, 5500, 550:

> Revenue := x -> price*sqrt(x):
  Cost := x -> fixedcost + x + varcost*sqrt(x):
> Profit := Revenue - Cost:   # Note this is an equation of functions!
  Profit(x);
        300 sqrt(x) - 5500 - x


Profit is twice differentiable, so again use Maple for a Second Derivative Test.

> dProfit := D(Profit):
  dProfit(x);
  c := solve(dProfit(x) = 0, x);
        150/sqrt(x) - 1
        c := 22500
> d2Profit := D(dProfit);
  d2Profit(c);
        -sqrt(22500)/6750000

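Since Profit''(c) is negative, we have the maximum Profit(22500) = $17,000.
Spending c = $22,500 on advertising corresponds to producing and selling
√22500 = 150 SP6 computers. A graph will verify our analysis.

> plot(Profit, 0..30000, thickness = 2, title = "Bell Computer Profit");

[Plot: Bell Computer Profit]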
Return to the pumping station example that opened this section.

Example 3.4. Oil Pumping Station Location.

Rigs-R-Us has an oil-drilling rig 9.5 miles offshore. The drilling rig will be
connected to a pumping station via an underwater pipe, which then connects to a
refinery by a land-based pipe. The refinery is 15.7 miles down the shoreline from
the drilling rig. See Figure 3.2. Underwater pipeline costs $32,575 per mile
to install; land-based pipeline costs $14,442 per mile. Determine a location for
the pumping station that minimizes the total cost of the pipeline.

FIGURE 3.2: Oil Pipe Route and Proposed Pumping Station (offshore oil rig
9.5 miles from the shoreline; pumping station on the shoreline; shoreline
pipeline to the refinery, 15.7 miles down the shore)


Problem. 
Find a relationship between the location of the pumping station and cost 
of the installation of the pipe, then minimize the cost. 


Assumptions. 

First, we assume no cost saving for the pipe if we purchase in larger lot 
sizes. We further assume the costs of preparing the terrain to lay the pipe, 
both offshore and on land, are captured in given cost per mile figures. 


Variables. 
x = the location of the pumping station along the horizontal distance
    from x = 0 to x = 15.7 miles
TC = total cost of the pipe for both underwater and on-land piping


Model Construction. 

Use the Pythagorean Theorem for the underwater distance of the pipe;
that is, the hypotenuse of the right triangle with height 9.5 miles and base
x miles along the shoreline. The length of the hypotenuse is $\sqrt{9.5^2 + x^2}$.
The length of the pipe on shore is $15.7 - x$. Therefore total cost is


\[ TC = 32575\,\sqrt{9.5^2 + x^2} + 14442\,(15.7 - x). \]


We turn to Maple for a Second Derivative Test to determine the minimum
cost Pumping Station location.

> TC := x -> 32575*sqrt(9.5^2 + x^2) + 14442*(15.7 - x):

> dTC := D(TC);
        dTC := x ↦ 32575 x/sqrt(90.25 + x^2) - 14442

> c := solve(dTC(x) = 0, x);
        c := 4.698818368

> d2TC := D(dTC);
        d2TC := x ↦ 32575/sqrt(90.25 + x^2) - 32575 x^2/(90.25 + x^2)^(3/2)

> d2TC(c);
        2469.416868

Since the second derivative is positive at c = 4.6988, we have a minimum
total cost. Thus, if the pumping station is located at 15.7 - 4.699 = 11.0
miles from the refinery, we will minimize the total pipeline cost with TC(c) =
$504,126.27.

Once again, we graph the total cost function to verify our results.

> plot(TC, 0..8, thickness = 2, title = "Pipeline Cost");

[Plot: Pipeline Cost]

How would the plot command be modified to show a point on the curve at
the minimum?
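
One possible answer is sketched here: draw the curve and a one-point plot separately, then overlay them. The names curve and minpt, the use of fsolve on [0, 15.7], and the marker style are assumptions made for illustration; other approaches work equally well.

```maple
with(plots):

TC := x -> 32575*sqrt(9.5^2 + x^2) + 14442*(15.7 - x):
c := fsolve(D(TC)(x) = 0, x = 0 .. 15.7):   # critical point, about 4.6988

# Build the cost curve and a single marked point, then display both together.
curve := plot(TC, 0 .. 8, thickness = 2):
minpt := pointplot([[c, TC(c)]], symbol = solidcircle, symbolsize = 15, color = red):
display(curve, minpt, title = "Pipeline Cost");
```

Ending the two plot assignments with a colon suppresses their individual output, so only the combined picture appears.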
Exercises


1. Each morning during rush hour, 10,000 people travel from New Jersey to 
New York. The trip lasts 40 minutes taking the subway. If x thousand 
people drive to New York City, it takes 20 + 5x minutes to make the trip. 


(a) Formulate the problem with the objective to minimize the average 
travel time per person. 

(b) Find the optimal number of people who drive so that the average 
time per person is minimized. 


2. Let f(x) = 2x? — 1. Use calculus to find the optimal solution to 


a f(z) 


3. Find all extrema for f(x) = sin(2 + 7x)/x on the interval 1 < x < 4. 
4. Consider the function $f(x) = 0.5x^3 - 8x^2$.


a. Use a graph to find and classify all extrema of f. 


b. Use analytical techniques (i.e., calculus) to find and classify all
extrema of f.


c. Define: A differentiable function is 
• convex or concave up on the interval [a,b] iff f lies above all of
its tangents; that is, for all $x_1 < x < x_2$ in [a, b],


f (x2) — f(a1) 


T2 — Tı 


f(@) < 


• concave or concave down on the interval [a,b] iff f lies below all
of its tangents; that is, for all $x_1 < x < x_2$ in [a,b],


f(z2) = flv) 


T2 — Tı 


F(=) 2 


A twice differentiable function is 


• convex on [a,b] iff f's second derivative is non-negative.
• concave on [a,b] iff f's second derivative is non-positive.


Apply the definition of concave (concave down) to show that f is 
concave over the interval [—6, 5]. 


d. What is the largest interval where f is convex? Concave? 


e. Why is knowing the concavity important in optimization? 
5. Dr. E. N. Throat has been taking x-rays of the trachea contracting during
coughing. He has found that the trachea appears to contract by 33% (1/3)
of its normal size. He has asked the Department of Mathematics to confirm
or deny his claim. You perform some initial research and find that under
reasonable assumptions about the elasticity of the tracheal wall and about
how air near the wall is slowed by friction, the average flow velocity v
can be modeled by the equation


\[ v = c\,(r_0 - r)\,r^2 \ \text{cm/sec} \quad \text{for } \tfrac{1}{2} r_0 \le r \le r_0 \]


where c is a positive constant (let c = 1) and $r_0$ is the resting radius of
the trachea in centimeters.


(a) Find the value of r that maximizes v. 


(b) Does your result support or deny Dr. Throat's claim?





3.2 Numerical Search Techniques with Maple 
Single-Variable Techniques 


When calculus methods aren’t feasible, we turn to numerical approximation 
techniques. The basic approach of most numerical methods in optimization 
is to produce a sequence of improved approximations to the optimal solution 
according to a specific scheme. We will examine both elimination methods 
(Golden section, and Fibonacci) and interpolation methods (Newton’s). 

In numerical methods of optimization, a procedure is used in obtaining 
values of the objective function at various combinations of the decision vari- 
ables, and conclusions are then drawn regarding the optimal solution. The 
elimination methods can be used to find an optimal solution for even discon- 
tinuous functions. An important relationship (assumption) must be made to 
use these elimination methods. The function must be unimodal. A unimodal 
function—uni-modal from “one mode”—is one that has only one maximum 
(peak) or one minimum (valley), but not both. State this mathematically as 


Definition. Unimodal Function. 
A function f is unimodal on the interval [a,b] iff


1. f has a maximum (minimum) $x^*$ in (a,b), and


2. f is strictly increasing (decreasing) on $[a, x^*]$, i.e., $x < x^* \Rightarrow f(x) < f(x^*)$,
and


3. f is strictly decreasing (increasing) on $[x^*, b]$, i.e., $x > x^* \Rightarrow f(x) < f(x^*)$.




Examples of unimodal and non-unimodal functions are shown in Figure 3.3.
Unimodal functions may or may not be differentiable or even continuous.











FIGURE 3.3: Examples of Unimodal and Non-Unimodal Functions:
(a) Differentiable & Unimodal, (b) Continuous & Unimodal,
(c) Noncontinuous & Unimodal, (d) Non-Unimodal


Thus, a unimodal function can be nondifferentiable (have corners) or even 
discontinuous. If a function is known to be unimodal in a given closed interval, 
then the optimum (maximum or minimum) can be found. 

In this section, we will learn many techniques for numerical searches. For 
the “elimination methods,” we accept an interval answer. If a single value 
is required, we usually evaluate the function at each end point of the final 
interval and the midpoint of the final interval, then take the optimum of those 
three values to approximate our single value. 


Golden Section Search 


A technique that searches for a value by determining narrower and narrower 
intervals containing the target value is called a bracketing method. A Golden 
Section search is a type of bracketing method that uses the golden ratio.¹ This
technique was developed in 1953 by Jack Kiefer.

To better understand the golden ratio, consider a line segment that is
divided into two parts as shown in Figure 3.4. If the ratio of the length of
the whole line to the length of the larger part is the same as the ratio of the
length of the larger part to the length of the smaller part, we say the line is
divided into the golden ratio. Symbolically, taking the length of the original
segment as 1 and the length of the larger part as r, this can be written as


\[ \frac{1}{r} = \frac{r}{1 - r}. \]


FIGURE 3.4: The Golden Ratio


Algebraic manipulation of the relationship above yields $r^2 + r - 1 = 0$. The
quadratic's positive root $r = (\sqrt{5} - 1)/2$ satisfies the ratio requirements for the
line segment above. The golden ratio's numerical value is $\phi \approx 0.6180339887$.
(The traditional symbol for the golden ratio is $\phi$.) This well-known ratio is
the limiting value for the ratio of consecutive terms of the Fibonacci
sequence, which we will see in the next method. Another bracketing method,
the Fibonacci search, is often used in lieu of the Golden Section method.
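
As a short worked step, the quadratic follows from cross-multiplying the golden ratio proportion. Here r denotes the length of the larger part of a unit segment, a labeling assumed to match the text's use of r.

```latex
\[
\frac{1}{r} = \frac{r}{1-r}
\;\Longrightarrow\; 1 - r = r^2
\;\Longrightarrow\; r^2 + r - 1 = 0
\;\Longrightarrow\; r = \frac{-1 \pm \sqrt{5}}{2}.
\]
```

Only the positive root $r = (\sqrt{5} - 1)/2 \approx 0.618$ can be a length. The identity $r^2 = 1 - r$ is what allows a Golden Section search to reuse one test point in each new interval.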

In order to use the Golden Section search procedure, we must ensure that 
certain assumptions hold. These key assumptions are: 


1. The function must be unimodal over a specified interval, 


2. the function must have an optimal solution over a known interval of uncer- 
tainty, and 


3. we will accept an interval solution since the exact optimal value cannot be 
found by bracketing, only approximated. 


Only an interval solution, that is, an interval containing the exact optimal 
value, known as the final interval of uncertainty, can be found using this 
technique. The length of the final interval is controlled by the user and can be 
made arbitrarily small by selecting a tolerance value. Assuming unimodality 
guarantees that the final interval’s length is less than the chosen tolerance. 





¹ See http://mathworld.wolfram.com/GoldenRatio.html.
Finding the Maximum of a Function over an Interval with a
Golden Section Search


Suppose we know that f is unimodal on an interval I. Break I into two
subintervals $I_1$ and $I_2$; then f's maximum must be in one or the other. Check
a test point in each interval: the higher test point indicates which interval
contains the maximum, since f is unimodal and has exactly one optimum
value. We can reduce the number of times we have to evaluate the function
by having the intervals overlap and using the endpoints as test points. We
can further reduce the number of function evaluations by cleverly choosing
the test points so that the next iteration reuses one of the current test points.
These ideas lead to the Golden Section search developed by Prof. Jack Kiefer
in his 1953 master's thesis.²

A Golden Section search for a maximum is iterative, requiring evaluations
of f(x) at test points $x_1$ and $x_2$, where


\[ x_1 = a + r(b - a) \quad\text{and}\quad x_2 = b - r(b - a). \]


The test points will lie inside the original interval $I_0 = [a,b]$ and are used
to determine a new, smaller search interval $I_1$. Choosing r carefully lets us
reuse the function's evaluations in the next iteration: either $x_1$ or $x_2$ will be
the new interval endpoint, and the other test point will be a test point in
the new, reduced interval. For a Golden Section search, r is chosen to be the
golden ratio 0.618. If $f(x_1) > f(x_2)$, the new interval is $[x_2, b]$, and if
$f(x_1) < f(x_2)$, the new interval is $[a, x_1]$. Continue to iterate in this manner
until the final interval length is less than the chosen tolerance. The final
interval $I_N$ contains, or brackets, the optimum solution. The length of $I_N$
determines the accuracy of the approximate optimum solution. The maximum
number of iterations N required to achieve this desired tolerance or final
interval length is

\[ N = \left\lceil \frac{\ln\bigl(\text{tolerance}/(b - a)\bigr)}{\ln(0.618)} \right\rceil. \]
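
As a quick check of the iteration bound $N = \lceil \ln(\text{tolerance}/(b-a)) / \ln(0.618) \rceil$, take the interval [0, 3] with tolerance 0.1 and the interval [0, 20] with tolerance 0.001, the settings used in the Golden Section examples of this section. Logarithm values are rounded.

```latex
\[
N = \left\lceil \frac{\ln(0.1/3)}{\ln(0.618)} \right\rceil
  = \left\lceil \frac{-3.401}{-0.4813} \right\rceil
  = \lceil 7.07 \rceil = 8,
\qquad
N = \left\lceil \frac{\ln(0.001/20)}{\ln(0.618)} \right\rceil
  = \left\lceil \frac{-9.903}{-0.4813} \right\rceil
  = \lceil 20.6 \rceil = 21.
\]
```

These counts agree with the last iteration numbers, 8 and 21, in the two Golden Section search tables.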


When we are required to provide a point solution, instead of a small inter- 
val containing the solution, evaluate the function at the end points and the 
midpoint of the final interval. Select the value of x that yields the largest f(x). 

How can the procedure be changed to search for a minimum point? 

Figure 3.5 shows the progression of subintervals selected by a Golden Sec- 
tion search. 








² See J. Kiefer, "Sequential minimax search for a maximum," Proc. Am. Math. Soc., 4 (3),
1953, 502-506.
FIGURE 3.5: Golden Section Search Interval Progression


The Golden Section search is written in algorithmic form below. 





Golden Section Search Algorithm


Find the maximum of the unimodal function f on the interval [a,b].
INPUTS:  Function: f
         Endpoints: a, b
         Tolerance: t
OUTPUTS: Optimal point $x^*$ and maximum value $f(x^*)$


Step 1. Initialize: Set $r = 0.618$, $a_0 = a$, $b_0 = b$, counter $k = 1$,
        limit $N = \lceil \ln(t/(b - a)) / \ln(0.618) \rceil$


Step 2. Calculate the test points $x_1 = a_{k-1} + r(b_{k-1} - a_{k-1})$
        and $x_2 = b_{k-1} - r(b_{k-1} - a_{k-1})$.


Step 3. Compute $f(x_1)$ and $f(x_2)$.


Step 4. If $f(x_1) > f(x_2)$, then $I_k = [x_2, b]$; that is, $a = x_2$.
        If $f(x_1) < f(x_2)$, then $I_k = [a, x_1]$; that is, $b = x_1$.


Step 5. If $k = N$ or $b_k - a_k < t$, then STOP.
        Return estimates $x^*$ = midpoint($I_k$) and $f(x^*)$.
        Otherwise, set $k = k + 1$ and return to Step 2.
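
The steps above can be sketched as a short Maple procedure. This is a minimal illustration, not the GoldenSectionSearch procedure from the PSMv2 package: the name GSSketch, its argument order, and its return value (the final interval and its midpoint) are assumptions made here. It stops on interval length only, and it reuses one function value per iteration as described in the text.

```maple
# Hypothetical helper (not part of PSMv2): golden section search for the
# maximum of a unimodal function f on [a0, b0] with tolerance t.
GSSketch := proc(f, a0, b0, t)
  local r, a, b, x1, x2, f1, f2;
  r := evalf((sqrt(5) - 1)/2);                 # golden ratio, about 0.618
  a, b := evalf(a0), evalf(b0);
  x1 := a + r*(b - a);  f1 := evalf(f(x1));    # upper test point
  x2 := b - r*(b - a);  f2 := evalf(f(x2));    # lower test point
  while b - a >= t do
    if f1 > f2 then
      # Maximum lies in [x2, b]; the old x1 becomes the new lower test point.
      a := x2;
      x2, f2 := x1, f1;
      x1 := a + r*(b - a);  f1 := evalf(f(x1));
    else
      # Maximum lies in [a, x1]; the old x2 becomes the new upper test point.
      b := x1;
      x1, f1 := x2, f2;
      x2 := b - r*(b - a);  f2 := evalf(f(x2));
    end if;
  end do;
  return [a, b], (a + b)/2;                    # final interval and midpoint
end proc:

# Example call on a nondifferentiable function:
f := x -> -abs(2 - x) - abs(5 - 4*x) - abs(8 - 9*x):
GSSketch(f, 0, 3, 0.1);
```

Because $r^2 = 1 - r$, the surviving test point is already in the correct position for the shortened interval, so each pass through the loop costs only one new evaluation of f.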











Although a Golden Section search can be used with any unimodal function
to find the maximum (or minimum) over a specified interval, its main
advantage comes when normal calculus procedures fail. Consider the following
example.


Example 3.5. Maximizing a Nondifferentiable Function. 
Maximize 
\[ f(x) = -|2 - x| - |5 - 4x| - |8 - 9x| \]


over the interval [0, 3]. 





Absolute value functions are not differentiable because they have corner
points (graph f to see this). Thus, taking the first derivative and setting it
equal to zero is not an option. We'll use a Golden Section search in Maple to
solve this problem. Our Maple procedure GoldenSectionSearch is in the text's
PSMv2 package. Use with(PSMv2) to make the procedure available.

> with(PSMv2):
> f := x -> -abs(2 - x) - abs(5 - 4*x) - abs(8 - 9*x);
        f := x ↦ -|2 - x| - |5 - 4 x| - |8 - 9 x|
> GoldenSectionSearch(f, 0, 3, 0.1):
  fnormal(%, 5);   # to fit the output to the page width
The maximum is f(0.907318)=-2.629270 with tolerance 0.100000

Iteration   x1        x2        f(x1)      f(x2)      Interval
0           1.8541    1.1459    -11.249    -3.5836    [0.0, 3]
1           1.1459    0.70820   -3.5836    -5.0851    [0.0, 1.8541]
2           1.4164    1.1459    -5.9969    -3.5836    [0.70820, 1.8541]
3           1.1459    0.97871   -3.5836    -2.9149    [0.70820, 1.4164]
4           0.97871   0.87539   -2.9149    -2.7446    [0.70820, 1.1459]
5           0.87539   0.81153   -2.7446    -3.6386    [0.70820, 0.97871]
6           0.91486   0.87539   -2.6594    -2.7446    [0.81153, 0.97871]
7           0.93925   0.91486   -2.7570    -2.6594    [0.87539, 0.97871]
8           0.91486   0.89978   -2.6594    -2.5991    [0.87539, 0.93925]





The midpoint of the final interval is 0.907318 and f(midpoint) = 
—2.629270, so we estimate the maximum of the function is —2.629 at the 
x value 0.9 (to within 0.1). 


Example 3.6. Maximizing a Transcendental Function. 
Maximize the function

\[ g(x) = 1 - \exp(-x) + \frac{1}{1 + x} \]

over the interval [0,20]. (Remember that $\exp(-x) = e^{-x}$.)
> g := x -> 1 - exp(-x) + 1/(1 + x):
> GoldenSectionSearch(g, 0, 20, 0.001):
  fnormal(%, 5);   # to fit the output to the page width
The maximum is f(2.512908)=1.203632 with tolerance 0.001000

Iteration   x1       x2       f(x1)    f(x2)    Interval
0           12.361   7.6393   1.0748   1.1153   [0.0, 20]
1           7.6393   4.7214   1.1153   1.1659   [0.0, 12.361]
2           4.7214   2.9180   1.1659   1.2012   [0.0, 7.6393]
3           2.9180   1.8034   1.2012   1.1920   [0.0, 4.7214]
4           3.6068   2.9180   1.1899   1.2012   [1.8034, 4.7214]
5           2.9180   2.4922   1.2012   1.2036   [1.8034, 3.6068]
6           2.4922   2.2291   1.2036   1.2021   [1.8034, 2.9180]
7           2.6548   2.4922   1.2033   1.2036   [2.2291, 2.9180]
8           2.4922   2.3917   1.2036   1.2034   [2.2291, 2.6548]
9           2.5543   2.4922   1.2036   1.2036   [2.3917, 2.6548]
10          2.4922   2.4538   1.2036   1.2036   [2.3917, 2.5543]
11          2.5160   2.4922   1.2036   1.2036   [2.4538, 2.5543]
12          2.5306   2.5160   1.2036   1.2036   [2.4922, 2.5543]
13          2.5160   2.5069   1.2036   1.2036   [2.4922, 2.5306]
14          2.5216   2.5160   1.2036   1.2036   [2.5069, 2.5306]
15          2.5160   2.5125   1.2036   1.2036   [2.5069, 2.5216]
16          2.5125   2.5104   1.2036   1.2036   [2.5069, 2.5160]
17          2.5138   2.5125   1.2036   1.2036   [2.5104, 2.5160]
18          2.5125   2.5117   1.2036   1.2036   [2.5104, 2.5138]
19          2.5130   2.5125   1.2036   1.2036   [2.5117, 2.5138]
20          2.5133   2.5130   1.2036   1.2036   [2.5125, 2.5138]
21          2.5130   2.5128   1.2036   1.2036   [2.5125, 2.5133]


The Golden Section search gives the maximum of the function as 1.204 at
x = 2.513.

Fibonacci Search


A Fibonacci search uses the ratio of Fibonacci numbers to generate a sequence
of test points based on the expression

\[ F_n = F_{n-1} + F_{n-2} \implies 1 = \frac{F_{n-1}}{F_n} + \frac{F_{n-2}}{F_n}. \]

Since $\lim_{n\to\infty} F_{n-1}/F_n$ equals the golden ratio, a Fibonacci search is a Golden
Section search "in the limit." However, a Fibonacci search converges faster
than a Golden Section search.

In order to use the Fibonacci search procedure, we must ensure that the 
key assumptions hold: 


1. The function must be unimodal over a specified interval, 


2. The function must have an optimal solution over a known interval of uncer- 
tainty, and 


3. We will accept an interval solution since the exact optimal value cannot 
be found by bracketing, only approximated. 


These are the same assumptions as required for a Golden Section search—the 
two searches are closely related bracketing methods. 

As before, only an interval solution—the final interval of uncertainty—can 
be found using this or any bracketing technique. The length of the final interval 
or tolerance value is controllable by the user, and can be made arbitrarily small 
restricted only by the precision of the computations and the computing time 
available. The final interval is guaranteed to be less than this tolerance value 
within a specific number of iterations. 

Replace the golden ratio r in the test point generating formulas of the
Golden Section search with the ratio of Fibonacci numbers to obtain

\[ x_1 = a + \frac{F_{n-2}}{F_n}\,(b - a), \qquad x_2 = a + \frac{F_{n-1}}{F_n}\,(b - a). \]

These are the Fibonacci search test point generators. The test points must lie
inside the original interval [a,b] since $a < x_1 < x_2 < b$, and will determine the
new search interval in the same fashion as a Golden Section search.


If $f(x_1) < f(x_2)$, then the new interval is $[x_1, b]$.
If $f(x_1) > f(x_2)$, then the new interval is $[a, x_2]$.


Equality means calculation precision is exceeded or a calculation error has 
occurred. 

The iterations continue until the final interval length is less than our
imposed tolerance. Our final interval must contain the optimum solution. The
size of the final interval determines the accuracy of the approximate optimum
solution and vice versa. The number of iterations required to achieve this
accepted interval length is determined by the smallest Fibonacci number $F_n$
that satisfies the inequality

\[ F_n > \frac{b - a}{\text{tolerance}}. \]
When we require a point solution instead of an interval solution, evaluate
the function f(x) at the endpoints and midpoint of the final interval. For
maximization problems, select the endpoint or midpoint that yields the
largest f(x) value.

The Fibonacci search algorithm follows. 





Fibonacci Search Algorithm


Find the maximum of the unimodal function f on the interval [a,b].
INPUTS:  Function: f
         Endpoints: a, b
         Tolerance: t
OUTPUTS: Optimal point $x^*$ and maximum value $f(x^*)$


Step 1. Initialize: Set $a_0 = a$, $b_0 = b$, counter $k = 1$


Step 2. Calculate the number of iterations n such that $F_n$ is the
        first Fibonacci number where $(b - a)/t < F_n$


Step 3. Calculate the test points

        \[ x_1 = a + \frac{F_{n-2}}{F_n}\,(b - a), \qquad x_2 = a + \frac{F_{n-1}}{F_n}\,(b - a) \]


Step 4. Compute $f(x_1)$ and $f(x_2)$


Step 5. If $f(x_1) < f(x_2)$, then the new interval is $[x_1, b]$
          Set $x_1 = x_2$
          Compute the new $x_2$ (for the new interval)
        If $f(x_1) > f(x_2)$, then the new interval is $[a, x_2]$
          Set $x_2 = x_1$
          Compute the new $x_1$ (for the new interval)


Step 6. If $n = 2$ or $b_n - a_n < t$, then STOP
        Return estimates $x^*$ = midpoint($I_n$) and $f(x^*)$.
        Otherwise, set $n = n - 1$ and return to Step 4.
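
The algorithm above can be sketched in Maple as follows. This is an illustrative sketch only, not the FibonacciSearch procedure from the PSMv2 package: the name FibSketch and its return value are assumptions, and its iterates need not match the PSMv2 output digit for digit. It uses combinat[fibonacci], which numbers the sequence so that F(1) = F(2) = 1.

```maple
# Hypothetical helper (not part of PSMv2): Fibonacci search for the
# maximum of a unimodal function f on [a0, b0] with tolerance t.
FibSketch := proc(f, a0, b0, t)
  local F, n, a, b, x1, x2, f1, f2;
  F := combinat[fibonacci];
  a, b := evalf(a0), evalf(b0);

  # Step 2: smallest n with F(n) > (b - a)/t
  n := 3;
  while F(n) <= (b - a)/t do n := n + 1 end do;

  # Steps 3 and 4: initial test points and function values
  x1 := a + F(n-2)/F(n)*(b - a);  f1 := evalf(f(x1));
  x2 := a + F(n-1)/F(n)*(b - a);  f2 := evalf(f(x2));

  # Steps 5 and 6: shrink the interval, reusing one test point each time
  while n > 2 and b - a >= t do
    n := n - 1;
    if f1 < f2 then
      a := x1;                        # maximum lies in [x1, b]
      x1, f1 := x2, f2;
      x2 := a + F(n-1)/F(n)*(b - a);  f2 := evalf(f(x2));
    else
      b := x2;                        # maximum lies in [a, x2]
      x2, f2 := x1, f1;
      x1 := a + F(n-2)/F(n)*(b - a);  f1 := evalf(f(x1));
    end if;
  end do;
  return [a, b], (a + b)/2;           # final interval and midpoint
end proc:

f := x -> -abs(2 - x) - abs(5 - 4*x) - abs(8 - 9*x):
FibSketch(f, 0, 3, 0.1);
```

One design point worth noting: in the last stages the two test points coincide at the midpoint of the interval, so the final reduction cannot distinguish the two halves. Practical implementations often offset one point slightly to break the tie; this sketch simply keeps the left half.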
Although a Fibonacci search can be used with any unimodal function to
find the maximum (or minimum) over a specified interval, its main advantage
comes when normal calculus procedures fail, such as when f is not differentiable
or continuous. Redo the first example of a Golden Section search.


Example 3.7. Maximizing a Nondifferentiable Function (reprise). 
Maximize 





\[ f(x) = -|2 - x| - |5 - 4x| - |8 - 9x| \]
over the interval [0,3]. 


We'll use the FibonacciSearch program from the text's PSMv2 library
package.

> f := x -> -abs(2 - x) - abs(5 - 4*x) - abs(8 - 9*x);
        f := x ↦ -|2 - x| - |5 - 4 x| - |8 - 9 x|
> FibonacciSearch(f, 0, 3, 0.1):
  fnormal(%, 5);   # to fit the output to the page width
The maximum is f(0.905658)=-2.622630 with tolerance 0.100000

Iteration   x1        x2        f(x1)      f(x2)      Interval
0           1.1471    1.8529    -3.5882    -11.235    [0.0, 3]
1           0.70588   1.1471    -5.1176    -3.5882    [0.0, 1.8529]
2           1.1471    1.4160    -3.5882    -5.9916    [0.70588, 1.8529]
3           0.97639   1.1471    -2.9056    -3.5882    [0.70588, 1.4160]
4           0.87395   0.97639   -2.7647    -2.9056    [0.70588, 1.1471]
5           0.80893   0.87395   -3.6749    -2.7647    [0.70588, 0.97639]
6           0.87395   0.91260   -2.7647    -2.6504    [0.80893, 0.97639]
7           0.91260   0.93737   -2.6504    -2.7495    [0.87395, 0.97639]
8           0.89811   0.91260   -2.5924    -2.6504    [0.87395, 0.93737]





How does this answer compare with the optimum found by the Golden Section 
search? 
Now redo the second example from earlier. 


Example 3.8. Maximizing a Transcendental Function (reprise). 
Maximize the function

\[ g(x) = 1 - \exp(-x) + \frac{1}{1 + x} \]

over the interval [0, 20].


> g := x -> 1 - exp(-x) + 1/(1 + x):
> FibonacciSearch(g, 0, 20, 0.1):
  fnormal(%, 5);   # to fit the output to the page width
The maximum is f(2.523088)=1.203630 with tolerance 0.100000

Iteration   x1       x2       f(x1)    f(x2)    Interval
0           7.6395   12.361   1.1153   1.0748   [0.0, 20]
1           4.7210   7.6395   1.1659   1.1153   [0.0, 12.361]
2           2.9179   4.7210   1.2012   1.1659   [0.0, 7.6395]
3           1.8032   2.9179   1.1920   1.2012   [0.0, 4.7210]
4           2.9179   3.6066   1.2012   1.1899   [1.8032, 4.7210]
5           2.4920   2.9179   1.2036   1.2012   [1.8032, 3.6066]
6           2.2289   2.4920   1.2021   1.2036   [1.8032, 2.9179]
7           2.4920   2.6547   1.2036   1.2033   [2.2289, 2.9179]
8           2.3916   2.4920   1.2034   1.2036   [2.2289, 2.6547]
9           2.4920   2.5542   1.2036   1.2036   [2.3916, 2.6547]
10          2.4537   2.4920   1.2036   1.2036   [2.3916, 2.5542]
11          2.4920   2.5158   1.2036   1.2036   [2.4537, 2.5542]
12          2.5158   2.5304   1.2036   1.2036   [2.4920, 2.5542]








The Fibonacci search gives the maximum as f(2.5) = 1.204. How does this 
answer compare with the optimum found by the Golden Section search? 


Interpolation with Derivatives: Newton’s Method 
Finding the Critical Points (Roots of the Derivative) of a Function 


Newton’s Method can be adapted to solve nonlinear optimization problems. 
For a twice differentiable function of one variable, the adaptation is straight- 
forward. Newton’s Method is applied to the derivative, rather than the original 
function we wish to optimize—the function’s critical points occur at roots of 
the derivative. Replace f with f' in the iterating formula for Newton's Method:

\[ N(x) = x - \frac{f(x)}{f'(x)} \implies M(x) = x - \frac{f'(x)}{f''(x)}. \]

Given a specified tolerance $\epsilon > 0$, the iterations of Newton's Method can be
terminated when $|x_{k+1} - x_k| < \epsilon$ or when $|f'(x_k)| < \epsilon$.

In order to use the modified Newton's Method to find critical points, the
function's first and second derivatives must exist throughout the neighborhood
of interest. Also note that whenever the second derivative at $x_k$ is zero, the
next point $x_{k+1}$ cannot be computed. (Explain why!) This method only
approximates the values of critical points; it does not know whether it is finding
a maximum or a minimum of the original function. Use the sign of the second
derivative to determine whether the critical value is a maximum, minimum,
or neither (a likely inflection point).
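
The modified iteration and its stopping rule can be sketched as a small Maple procedure. The name NewtonCritical, the iteration cap maxit, and the choice to return the second derivative alongside the estimate are all assumptions made for illustration; the procedure stops when successive iterates differ by less than the tolerance.

```maple
# Hypothetical helper: Newton's Method applied to f' to locate a critical
# point of f, starting from x0, with tolerance eps.
NewtonCritical := proc(f, x0, eps, maxit := 50)
  local df, ddf, x, xnew, k;
  df := D(f);                      # first derivative as a function
  ddf := D(df);                    # second derivative as a function
  x := evalf(x0);
  for k to maxit do
    if ddf(x) = 0 then
      error "second derivative is zero at x = %1", x;
    end if;
    xnew := evalf(x - df(x)/ddf(x));
    if abs(xnew - x) < eps then
      # Return the estimate and f''(estimate) for classification.
      return xnew, ddf(xnew);
    end if;
    x := xnew;
  end do;
  error "no convergence after %1 iterations", maxit;
end proc:

# Example call: a cubic, started at 1.0 with tolerance 0.01.
NewtonCritical(x -> -2*x^3 + 10*x + 5, 1.0, 0.01);
```

A negative second value in the returned pair indicates a local maximum, a positive one a local minimum; a value near zero calls for a closer look with test points or a graph.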


Example 3.9. A Simple Application. 
Maximize the polynomial $p(x) = 5x - x^2$ over the interval [0, 7].


This simple problem is easy to solve using elementary calculus.
1. Differentiate: p' = 5 - 2x.
2. The only critical point, the root of p', is x* = 5/2.
3. Differentiate again: p'' = -2 < 0.
4. The Second Derivative Test indicates that p has a maximum of 25/4 at
x* = 5/2.

Let's use Maple to implement the modified Newton's method for finding the
maximum of p.

> p := x -> 5*x - x^2:
  dp := D(p):
  ddp := D(dp):
> Newton := x -> x - dp(x)/ddp(x):

Start with $x_0 = 1.0$.

> x[0] := 1.0;
  x[1] := Newton(x[0]);
  x[2] := Newton(x[1]);
        x[0] := 1.0
        x[1] := 2.500000000
        x[2] := 2.500000000

Starting at any other value also yields x* = 2.5. Since this simple quadratic
function has a linear derivative, the linear approximation of the derivative
will be exact regardless of the starting point, and the answer will appear at
the first iteration. See Figure 3.6. Algebraically simplifying the method's
iterating function for p confirms this: Newton(x) = 2.5.
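
The simplification can be written out in one line, using p'(x) = 5 - 2x and p''(x) = -2 from the calculus solution of this example:

```latex
\[
\text{Newton}(x) = x - \frac{p'(x)}{p''(x)}
                 = x - \frac{5 - 2x}{-2}
                 = x + \frac{5 - 2x}{2}
                 = \frac{5}{2}.
\]
```

The variable x cancels, so every starting value is mapped directly to 5/2.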
FIGURE 3.6: Plot of $p(x) = 5x - x^2$ with its Linear Approximation at x = 4


The slope of the linear approximation of the function at the point is pre- 
cisely the slope of the function at that point, so the linear approximation is 
tangent to the function at the point as shown in the figure. 


Example 3.10. A Third Degree Polynomial. 
Find a maximal point, if possible, for $p(x) = -2x^3 + 10x + 5$ on the interval
[0,3] to a tolerance of $\epsilon = 0.01$.


We'll use the same process in Maple, this time with the starting value
$x_0 = 1.0$.

> p := x -> -2*x^3 + 10*x + 5:
  dp := D(p):
  ddp := D(dp):
> Newton := x -> x - dp(x)/ddp(x):

Start again with $x_0 = 1.0$.

> x[0] := 1.0;
  x[1] := Newton(x[0]);
  x[2] := Newton(x[1]);
        x[0] := 1.0
        x[1] := 1.333333333
        x[2] := 1.291666667
> abs(x[2] - x[1]) < 0.01;
        0.041666666 < 0.01

The desired tolerance has not been achieved, so we're not finished.

> x[3] := Newton(x[2]);
  abs(x[3] - x[2]) < 0.01;
        x[3] := 1.290994624
        0.000672043 < 0.01


We have a result! Checking the second derivative shows that p(1.29) = 13.6 
is our maximum (to within 0.01). 

Maple's Student[Calculus1] package contains NewtonsMethod for finding
the roots of a function. To use NewtonsMethod to find critical points, all we
have to do is replace the function with its first derivative. We illustrate below.

> with(Student[Calculus1]):
> p := x -> -2*x^3 + 10*x + 5:
  dp := D(p):
> NewtonsMethod(dp(x), x = 1.0);
        1.290994449
> NewtonsMethod(dp(x), x = 1.0, output = sequence);
        1.0, 1.333333333, 1.291666667, 1.290994624, 1.290994449
> NewtonsMethod(dp(x), x = 0.3, output = plot);

[Plot titled "Newton's Method": f(x) = -6x^2 + 10 with its tangent lines.
From the initial point x = 0.3, at most 5 iteration(s) of Newton's method
for f(x) = -6x^2 + 10.]





Try NewtonsMethod with output=animation. 
Exercises


Compare the results of using the Golden Section, Fibonacci’s, and Newton’s 
methods in the following. 


1. Maximize $f(x) = -x^2 - 2x$ on the closed interval [-2, 1]. Use a tolerance
for the final interval of 0.6. Hint: Start Newton's method at x = -0.5.


2. Maximize $f(x) = -x^2 - 3x$ on the closed interval [-3, 1]. Use a tolerance
for the final interval of 0.6. Hint: Start Newton's method at x = 1.


3. Minimize $f(x) = x^2 + 2x$ on the closed interval [-3, 1]. Use a tolerance
for the final interval of 0.5. Hint: Start Newton's method at x = -3.


4. Minimize $f(x) = -x + e^x$ over the interval [-1, 3] using a tolerance of 0.1.
Hint: Start Newton's method at x = -1.


5. List at least two assumptions required by both Golden Section and 
Fibonacci’s search methods. 


6. Consider minimizing $f(x) = -x + e^x$ over the interval $[-1, 3]$. Assume
the search method yielded a final interval of $[-0.80, 0.25]$ to within the
tolerance of $\epsilon$. Report a single best value of $x$ to minimize $f(x)$ over the
interval. Explain your reasoning.


7. Modify the Golden Section (pg. 109) algorithm to find a minimum value of 
a unimodal function. Write a Maple procedure to implement your change. 


8. Modify the Fibonacci search (pg. 113) algorithm to find a minimum value 
of a unimodal function. Write a Maple procedure to implement your 
change. 








Projects 


Project 3.1. Write a program in Maple that uses a secant method search. 
Apply your program to Exercises 1 through 4 above. 


Project 3.2. Write a program in Maple that uses a Regula-Falsi search 
method. Apply your program to Exercises 1 through 4 above. 


Project 3.3. Carefully analyze the rate of convergence of the different search
methods presented.


4


Problem Solving with Multivariable
Constrained and Unconstrained Optimization











Objectives: 


(1) Set up a solution process for multivariable optimization problems.
(2) Recognize when applicable constraints are required. 


(3) Perform sensitivity analysis, when applicable, and interpret the 
results. 


(4) Know the numerical techniques to obtain approximate solutions 
and when to use them. 








4.1 Unconstrained Optimization: Theory 


A small company is planning to install a central computer with cable links to 
five new departments. According to their floor plan, the peripheral computers 
for the five departments will be situated as shown in Figure 4.1 below. The 
company wishes to locate the central computer to minimize the amount of 
cable used to link the five peripheral stations because signals degrade with 
cable length and high quality cable is expensive. Assume that cable is strung 
above the ceiling panels in a straight line from a point above a peripheral to a 
point above the central computer; the standard distance formula may be used 
to determine the length needed to connect a remote to the central computer. 
Ignore the segments of cable from any device to a point above the ceiling 
panel immediately over that device. That is, work only with lengths of cable 
strung over the ceiling panels—the cable segment from the peripheral to the 
ceiling doesn’t change with the different routings. The location coordinates of 
the five peripheral computers appear in Table 4.1. 


TABLE 4.1: Grid of the Five Departments 


X | 15  25  60  75  80
Y | 60  90  75  60  25


FIGURE 4.1: Grid of the Five Departments


The central computer will be positioned at coordinates (m,n) where m 
and n are both integers in the grid representing office space. Determine the 
coordinates (m,n) that minimize the total amount of cable needed. Report 
the total number of feet of cable needed for this placement along with the 
coordinates (m, n). 

To model and solve problems like this, we need multivariable optimization. 

In this chapter, our goal is to introduce both constrained and uncon- 
strained nonlinear optimization. For a more thorough coverage, we suggest 
studying complete texts on the subject such as Ruszczynski’s Nonlinear Opti- 
mization (Princeton Univ. Press, 2006). 


Basic Theory 


We begin with how to find an optimal solution, if one exists, for the uncon- 
strained nonlinear optimization problem 


Maximize (or minimize) $f(x_1, x_2, \ldots, x_n)$ over $\mathbb{R}^n$.


We assume that all the first and second partial derivatives of $f$ exist and
are continuous at all points in $f$’s domain. Let the partial derivative of $f$ with
respect to $x_i$ be

$$\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}.$$

Candidate critical points (stationary points) are found where

$$\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i} = 0 \quad \text{for } i = 1, 2, \ldots, n.$$

This set of conditions gives a system of equations that, when solved, yields one
or more critical points (if any exist) of $f$. Each critical point satisfies each
equation $\partial f(\text{critical point})/\partial x_i = 0$ for $i = 1, \ldots, n$.


Theorem. Local Extremum Characterization. 


If $x^* = (x_1^*, x_2^*, \ldots, x_n^*)$ is a local extremum of the twice continuously
differentiable function $f$, then $x^*$ satisfies

$$\frac{\partial f(x^*)}{\partial x_i} = 0 \quad \text{for } i = 1, 2, \ldots, n.$$

We have previously defined all points $x$ that satisfy $\partial f(x)/\partial x_i = 0$ for
$i = 1, 2, \ldots, n$ as critical points or stationary points. Not all critical points
(stationary points) are local extrema. If a stationary point is not a local
extremum (a maximum or a minimum), then it is called a saddle point.


The Hessian Matrix 


We used a function’s concavity to recognize extrema in the single-variable 
case. How do we determine the “concavity” of functions of more than one 
variable? 


Definition. Convex Multivariable Function. 
A function $f(x)$ is convex iff for every pair of points $x^{(1)}$ and $x^{(2)}$ in $f$’s
domain and any $\lambda \in [0, 1]$, we have

$$f\big((1 - \lambda)x^{(1)} + \lambda x^{(2)}\big) \le (1 - \lambda) f(x^{(1)}) + \lambda f(x^{(2)}).$$

Similarly, $f$ is concave iff for every pair of points $x^{(1)}$ and $x^{(2)}$ in $f$’s domain
and any $\lambda \in [0, 1]$, we have

$$f\big((1 - \lambda)x^{(1)} + \lambda x^{(2)}\big) \ge (1 - \lambda) f(x^{(1)}) + \lambda f(x^{(2)}).$$

In words, this definition says that $f$ is convex iff whenever we look at an
$x^*$ on the line segment connecting $x^{(1)}$ and $x^{(2)}$, the value $f(x^*)$ is at or
below the point the same fraction of the way along the line segment connecting
$f(x^{(1)})$ and $f(x^{(2)})$. For concave, change “below” to “above.”
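
As a quick numerical illustration of the convexity inequality, take $f(x_1, x_2) = x_1^2 + 3x_2^2$ with one arbitrarily chosen pair of points and $\lambda = 1/2$. The points are an illustrative choice, and a single check like this does not prove convexity; it only shows what the inequality says.

```latex
\begin{align*}
x^{(1)} &= (0,0), \qquad x^{(2)} = (2,2), \qquad \lambda = \tfrac{1}{2},\\
f\big((1-\lambda)x^{(1)} + \lambda x^{(2)}\big) &= f(1,1) = 1^2 + 3\cdot 1^2 = 4,\\
(1-\lambda)\,f(x^{(1)}) + \lambda\, f(x^{(2)}) &= \tfrac{1}{2}\cdot 0 + \tfrac{1}{2}\cdot 16 = 8,\\
4 &\le 8 .
\end{align*}
```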

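Next, we introduce the Hessian matrix, named for Ludwig Hesse who
studied surfaces in the 1800s, that allows us to determine the convexity of
multivariable functions. The Hessian matrix will also provide additional
information about the critical points.

Definition. The Hessian Matrix of f.
The Hessian matrix of a multivariable function $f$ is the $n \times n$ matrix of second
partial derivatives where the $(i, j)$th entry is $\partial^2 f/(\partial x_i\, \partial x_j)$:

$$H(f) = \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}$$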


When $f$ has continuous second partial derivatives, the mixed partials are equal
and the Hessian matrix is symmetric.¹


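In the 2-variable case of a function with continuous second partials,

$$H(f) = \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\
\dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \dfrac{\partial^2 f}{\partial x_2^2}
\end{bmatrix}.$$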


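Example 4.1. A Simple 2 × 2 Hessian.
Let $f(x_1, x_2) = x_1^2 + 3x_2^2$. Find $f$’s Hessian. Since

$$\frac{\partial f}{\partial x_1} = 2x_1 \quad \text{and} \quad \frac{\partial f}{\partial x_2} = 6x_2,$$

then

$$\frac{\partial^2 f}{\partial x_1^2} = 2, \qquad \frac{\partial^2 f}{\partial x_1 \partial x_2} = 0, \qquad \frac{\partial^2 f}{\partial x_2^2} = 6.$$

So $f$’s Hessian is

$$H(f) = \begin{bmatrix} 2 & 0 \\ 0 & 6 \end{bmatrix}.$$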

Example 4.2. Another 2 × 2 Hessian.
Let $g(x_1, x_2) = -x_1^2 + 3x_1x_2 - 3x_2^2$. Find $g$’s Hessian. Since

$$\frac{\partial g}{\partial x_1} = -2x_1 + 3x_2 \quad \text{and} \quad \frac{\partial g}{\partial x_2} = -6x_2 + 3x_1,$$

then

$$\frac{\partial^2 g}{\partial x_1^2} = -2, \qquad \frac{\partial^2 g}{\partial x_1 \partial x_2} = 3, \qquad \frac{\partial^2 g}{\partial x_2^2} = -6.$$

So $g$’s Hessian is

$$H(g) = \begin{bmatrix} -2 & 3 \\ 3 & -6 \end{bmatrix}.$$


The Hessian will play the role of the second derivative in the Second Derivative
Test as we extend from single-variable to multivariable functions. We will
use determinants to analyze the Hessian. Recall, the determinant of a 2 × 2
matrix $A$ is

$$|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$

Calculating determinants of larger matrices is defined recursively using expansion
by minors.² We can use leading principal minors to analyze the Hessian.

¹ The equality of mixed partials for smooth functions is called Schwarz’s Theorem or
Clairaut’s Equality of Mixed Partials Theorem. Euler had discovered this very important
result in 1734 when he was 27 years old.


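Definition 4.1. Leading Principal Minor.

The $i$th leading principal minor of a square $n \times n$ matrix $A$ is the determinant of
an $i \times i$ submatrix obtained by deleting $n - i$ rows of $A$ and deleting the same
$n - i$ columns of $A$, where $i = 1, 2, \ldots, n$. (Many linear algebra texts call these the
principal minors of $A$ and reserve "leading" for the single minor of each order that
keeps the first $i$ rows and columns; this chapter uses "leading principal minor" for
all of them.)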


Example 4.3. The Leading Principal Minors of a Hessian Matrix.
Given the 3 × 3 Hessian matrix

$$H = \begin{bmatrix} 2 & 0 & 4 \\ 0 & 1 & 5 \\ 4 & 5 & 3 \end{bmatrix},$$

determine all leading principal minors.


1. There are three 1st order leading principal minors ($i = 1$: eliminate $3 - 1 = 2$
rows and the same columns).

a. Eliminate rows 1 and 2 and columns 1 and 2: $M = [3]$. Then $|3| = 3$.
b. Eliminate rows 1 and 3 and columns 1 and 3: $M = [1]$. Then $|1| = 1$.
c. Eliminate rows 2 and 3 and columns 2 and 3: $M = [2]$. Then $|2| = 2$.

The 1st order leading principal minors are the entries of the main diagonal.


2. There are three 2nd order leading principal minors ($i = 2$: eliminate
$3 - 2 = 1$ row and the same column).

a. Eliminate row 1 and column 1: $M = \begin{bmatrix} 1 & 5 \\ 5 & 3 \end{bmatrix}$. Then $|M| = 3 - 25 = -22$.
b. Eliminate row 2 and column 2: $M = \begin{bmatrix} 2 & 4 \\ 4 & 3 \end{bmatrix}$. Then $|M| = 6 - 16 = -10$.
c. Eliminate row 3 and column 3: $M = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$. Then $|M| = 2 - 0 = 2$.


3. There is only one 3rd order leading principal minor, the determinant of $H$
itself. (Eliminate no rows or columns.) Then

$$|H| = -60.$$

² For a full discussion of determinants, see, e.g., Gilbert Strang’s Introduction to Linear
Algebra.


Notice that the leading principal minors of a matrix are just the determi- 
nants of all the submatrices formed along the main diagonal. 
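
The enumeration in Example 4.3 is mechanical, so it can be sketched in a few lines. This is an illustrative Python sketch (standard library only), not a Maple command; the helper names `det` and `minors_of_order` are hypothetical. It keeps every choice of $k$ matching rows and columns and takes the determinant of what remains, following Definition 4.1 as stated in this chapter.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Delete row 0 and column j to form the minor's submatrix.
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(sub)
    return total

def minors_of_order(m, k):
    """Determinants of all k-by-k submatrices that keep the same k rows and columns."""
    n = len(m)
    return [det([[m[i][j] for j in keep] for i in keep])
            for keep in combinations(range(n), k)]

H = [[2, 0, 4], [0, 1, 5], [4, 5, 3]]
for k in (1, 2, 3):
    print(k, minors_of_order(H, k))
```

The minors come out in the order of the kept index sets, so the 2nd order list is ordered by which row and column were kept rather than which were deleted.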

How do we use all these determinants, the leading principal minors, of the 
Hessian matrix to determine the convexity of the multivariate function? The 
following theorem answers our question. 


Theorem. Convexity and the Hessian Matrix. 
Suppose $f(x_1, x_2, \ldots, x_n)$ has continuous second partial derivatives at all
points of its domain $D$. Let $H$ be $f$’s Hessian matrix. Then


a. if all the leading principal minors of $H$ are nonnegative, then $f$ is convex;


b. if all the leading principal minors of $H$ match the sign of $(-1)^k$ where $k$
is the minor’s order (positive for even order, negative for odd), then $f$ is
concave;


c. otherwise, f is neither convex, nor concave. 


Example 4.4. Determining Convexity.
Determine the convexity of the function $f(x_1, x_2) = x_1^2 + 3x_2^2$ using its Hessian
matrix.


1. The Hessian of $f$ is

$$H = \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\
\dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \dfrac{\partial^2 f}{\partial x_2^2}
\end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 6 \end{bmatrix}.$$


2. a. The 1st order leading principal minors are $|2| = 2 > 0$ and $|6| = 6 > 0$.
b. The 2nd order leading principal minor is $|H| = 12 > 0$.


3. Since all leading principal minors are nonnegative, the Convexity and the
Hessian Matrix theorem tells us that $f$ is a convex function.

³ Maple uses the same notation for determinant. Enter the matrix H, then execute |H|.


Figure 4.2 shows f’s graph. 





FIGURE 4.2: Graph of $f(x_1, x_2) = x_1^2 + 3x_2^2$


Example 4.5. Determining Concavity Using the Hessian.
Determine the convexity of the function $g(x_1, x_2) = -x_1^2 - 3x_2^2 + x_1x_2$ using
its Hessian matrix.


1. The Hessian of $g$ is

$$H = \begin{bmatrix}
\dfrac{\partial^2 g}{\partial x_1^2} & \dfrac{\partial^2 g}{\partial x_1 \partial x_2} \\
\dfrac{\partial^2 g}{\partial x_1 \partial x_2} & \dfrac{\partial^2 g}{\partial x_2^2}
\end{bmatrix} = \begin{bmatrix} -2 & 1 \\ 1 & -6 \end{bmatrix}.$$

2. a. The 1st order leading principal minors are $|-2| = -2 < 0$ and $|-6| = -6 < 0$.
b. The 2nd order leading principal minor is $|H| = 11 > 0$.

3. Since the 1st order leading principal minors are negative and the 2nd order
is positive, the Convexity and the Hessian Matrix theorem tells us that $g$
is a concave function.


Figure 4.3 shows g’s graph. 





FIGURE 4.3: Graph of $g(x_1, x_2) = -x_1^2 - 3x_2^2 + x_1x_2$


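Example 4.6. Determining Nonconvexity Using the Hessian.
Determine the convexity of the function $h(x_1, x_2) = x_1^2 - 3x_2^2 + x_1x_2$ using its
Hessian matrix.


1. The Hessian of $h$ is

$$H(h) = \begin{bmatrix}
\dfrac{\partial^2 h}{\partial x_1^2} & \dfrac{\partial^2 h}{\partial x_1 \partial x_2} \\
\dfrac{\partial^2 h}{\partial x_1 \partial x_2} & \dfrac{\partial^2 h}{\partial x_2^2}
\end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & -6 \end{bmatrix}.$$


2. The 1st order leading principal minors are $|2| = 2 > 0$ and $|-6| = -6 < 0$.
Stop. Since we have opposite signs, $h$ cannot be either convex or concave;
$h$ has a saddle point.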


The function is neither convex nor concave; this Hessian matrix is called
indefinite.

Figure 4.4 shows h’s graph.


FIGURE 4.4: Graph of $h(x_1, x_2) = x_1^2 - 3x_2^2 + x_1x_2$


A quick trip to Maple verifies the analysis. 


> with(Student[VectorCalculus]):   # for 'Hessian'
  with(LinearAlgebra):             # for 'IsDefinite'
> h := (x1, x2) -> x1^2 - 3*x2^2 + x1*x2:
> H := Hessian(h(x1, x2), [x1, x2]);
                         H := [ 2   1 ]
                              [ 1  -6 ]
> IsDefinite(H, query = 'indefinite');
                         true





Part a of the Convexity and the Hessian Matrix Theorem (pg. 126) covers
Hessians whose leading principal minors are all nonnegative; such a Hessian
is called positive semi-definite, and positive definite when every minor is
strictly positive. Part b of the theorem covers Hessians whose leading principal
minors follow the sign of $(-1)^k$; such a Hessian is called negative
semi-definite, and negative definite when no minor is zero. Part c gives us an
indefinite Hessian. The calculation of definiteness becomes more involved when
variables appear in the function’s Hessian. Let’s use Maple to test and verify
definiteness for the functions of our examples. (Remember to load both
Student[VectorCalculus] and LinearAlgebra packages.)
> f := (x, y) -> x^2 + 3*y^2:
  g := (x, y) -> -x^2 - 3*y^2 + x*y:
  h := (x, y) -> x^2 - 3*y^2 + x*y:
> Hf := Hessian(f(x1, x2), [x1, x2]);
  IsDefinite(Hf, query = 'positive_definite');
  IsDefinite(Hf, query = 'negative_definite');
  IsDefinite(Hf, query = 'indefinite');
                         Hf := [ 2  0 ]
                               [ 0  6 ]
                         true
                         false
                         false
> Hg := Hessian(g(x1, x2), [x1, x2]);
  IsDefinite(Hg, query = 'positive_definite');
  IsDefinite(Hg, query = 'negative_definite');
  IsDefinite(Hg, query = 'indefinite');
                         Hg := [ -2   1 ]
                               [  1  -6 ]
                         false
                         true
                         false
> Hh := Hessian(h(x1, x2), [x1, x2]);
  IsDefinite(Hh, query = 'positive_definite');
  IsDefinite(Hh, query = 'negative_definite');
  IsDefinite(Hh, query = 'indefinite');
                         Hh := [ 2   1 ]
                               [ 1  -6 ]
                         false
                         false
                         true





Maple has a number of functions that compute the Hessian of a function. 
Versions of the Hessian command are in the Student[VectorCalculus],
VectorCalculus, and LinearAlgebra packages. Consult Maple’s Help pages to help
you decide which version is best to use in a given situation. 

In a more general setting, if a function has terms with power higher than
2 or includes transcendental functions, such as $x_1 x_2^3$ or $\sin(x_1 x_2)$, the Hessian
will have terms with variables. Then the analysis of definiteness for classifying 
a stationary point as a maximum, minimum, or saddle point is a bit more 
complicated. 

For two variables, we use the quadratic form

$$Q(x_1, x_2) = \begin{bmatrix} x_1 & x_2 \end{bmatrix} H \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

to test the Hessian matrix. In general, we use $Q(z) = z^T H z$. Then define

Definition 4.2. Definiteness of a Matrix.
Let $H$ be $f$’s Hessian matrix and $Q(z) = z^T H z$. Then $H$ is

1. positive definite iff $Q(z) > 0$ for every nonzero vector $z \in \operatorname{domain}(f)$,
2. positive semi-definite iff $Q(z) \ge 0$ for every nonzero vector $z \in \operatorname{domain}(f)$,
3. negative definite iff $Q(z) < 0$ for every nonzero vector $z \in \operatorname{domain}(f)$,
4. negative semi-definite iff $Q(z) \le 0$ for every nonzero vector $z \in \operatorname{domain}(f)$,
5. indefinite otherwise.
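
To make the definition concrete, here is the quadratic form worked out for two of the constant Hessians from Examples 4.4 and 4.6. The test vectors $(1,0)$ and $(0,1)$ are an illustrative choice.

```latex
\begin{align*}
H_f = \begin{bmatrix} 2 & 0 \\ 0 & 6 \end{bmatrix}:\quad
  & Q(z) = 2z_1^2 + 6z_2^2 > 0 \ \text{for every } z \neq 0
  && \Rightarrow\ H_f \text{ is positive definite},\\[4pt]
H_h = \begin{bmatrix} 2 & 1 \\ 1 & -6 \end{bmatrix}:\quad
  & Q(z) = 2z_1^2 + 2z_1z_2 - 6z_2^2,\\
  & Q(1,0) = 2 > 0, \qquad Q(0,1) = -6 < 0
  && \Rightarrow\ H_h \text{ is indefinite}.
\end{align*}
```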

We can relate definiteness to the signs of the leading principal minors. 
Positive Definite: All leading principal minors are positive.


Positive Semi-definite: All leading principal minors are non-negative
(some may be zero).


Negative Definite: The leading principal minors follow the signs of $(-1)^k$,
where $k$ is the order of the leading principal minor. Even order leading
principal minors are positive, and odd order leading principal minors are
negative.


Negative Semi-definite: All nonzero leading principal minors follow the
signs of $(-1)^k$, where $k$ is the order of the leading principal minor. Even
order nonzero leading principal minors are positive, and odd order nonzero
leading principal minors are negative. Some may be zero.


Indefinite: Some leading principal minors do not follow any of the rules
above.
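
The sign rules above translate directly into a small classifier. The following is a hedged sketch in Python (standard library only), not Maple's IsDefinite; the names `det`, `minors_with_order`, and `classify` are hypothetical. It assumes a symmetric matrix with exact (integer or rational) entries and, following Definition 4.1 as used in this chapter, examines every minor that keeps the same rows and columns.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(sub)
    return total

def minors_with_order(m):
    """Return (k, minor) pairs for all minors keeping the same k rows and columns."""
    n = len(m)
    return [(k, det([[m[i][j] for j in keep] for i in keep]))
            for k in range(1, n + 1)
            for keep in combinations(range(n), k)]

def classify(m):
    """Apply the sign rules for the minors, strictest cases first."""
    minors = minors_with_order(m)
    if all(d > 0 for _, d in minors):
        return "positive definite"
    if all((-1) ** k * d > 0 for k, d in minors):
        return "negative definite"
    if all(d >= 0 for _, d in minors):
        return "positive semi-definite"
    if all((-1) ** k * d >= 0 for k, d in minors):
        return "negative semi-definite"
    return "indefinite"

for name, m in [("f", [[2, 0], [0, 6]]),
                ("g", [[-2, 1], [1, -6]]),
                ("h", [[2, 1], [1, -6]])]:
    print(name, classify(m))
```

Checking the strict cases before the semi-definite ones matters, since every positive definite matrix also passes the nonnegative test.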


If the Hessian is also a function of the independent variables, its definite- 
ness might vary from one point to another. To test the definiteness of the 
Hessian at a point x*, evaluate the Hessian at the point. For example, in the 
matrix

$$H(x_1, x_2) = \begin{bmatrix} 2x_1 & x_2 \\ x_2 & 4 \end{bmatrix},$$

the specific values of $x_1$ and $x_2$ determine whether $H$ is positive definite,
positive semi-definite, negative definite, negative semi-definite, or indefinite
at $(x_1, x_2)$. Give an example of each case that can occur!

Table 4.2 summarizes the relationship between the Hessian matrix
definiteness and the classification of stationary points (extrema) as maximum,
minimum, saddle points, or ‘inconclusive.’

TABLE 4.2: Summary of Hessian Definiteness

Determinants $H_k$ ($k$th order leading principal minors) | Results | Conclusions About the Stationary Point
All $H_k > 0$ | Positive Definite, $f$ is convex | minimum
All $H_k \ge 0$ | Positive Semi-Definite, $f$ is convex | local minimum
$\operatorname{sgn}(H_k) = \operatorname{sgn}((-1)^k)$ | Negative Definite, $f$ is concave | maximum
$\operatorname{sgn}(H_k) = \operatorname{sgn}((-1)^k)$ or some, not all, $H_k = 0$ | Negative Semi-Definite, $f$ is concave | local maximum
$H_k$ not all 0, but none of the above | Indefinite, $f$ is neither | saddle point
$H_k = 0$ for all $k$ | Indefinite | inconclusive


Example 4.7. Using the Summary Table. 


1. Suppose the Hessian of the function $f(x_1, x_2)$ is given by

$$H(x_1, x_2) = \begin{bmatrix} -2 & -1 \\ -1 & -4 \end{bmatrix}.$$

The 1st order leading principal minors are $-2$ and $-4$. The 2nd order
leading principal minor is $|H(x_1, x_2)| = 7$. Since the first orders are
negative and the second order is positive, the leading principal minors follow
the sign of $(-1)^k$. The summary table indicates that $f$ is concave and
the stationary point found is a maximum. This Hessian matrix is negative
definite.


> H := Matrix([[-2, -1], [-1, -4]]);
                         H := [ -2  -1 ]
                              [ -1  -4 ]
> IsDefinite(H, query = 'negative_definite');
                         true





2. Suppose the Hessian of the function $f$ is given by

$$H = \begin{bmatrix} 3 & 2 & 1 \\ 6 & 5 & 4 \\ 9 & 8 & 7 \end{bmatrix}.$$

The 1st order leading principal minors are 3, 5, and 7. The 2nd order
leading principal minors are 3, 12, and 3. The 3rd order leading principal
minor, the determinant of $H$, is 0. All leading principal minors are positive
or zero. The summary table indicates that $f$ is convex and any corresponding
stationary points found would be local minima. This Hessian matrix
is positive semi-definite.


> H := Matrix([[3, 2, 1], [6, 5, 4], [9, 8, 7]]);
                         H := [ 3  2  1 ]
                              [ 6  5  4 ]
                              [ 9  8  7 ]
> IsDefinite(H, query = 'positive_definite');
                         false
> IsDefinite(H, query = 'positive_semidefinite');
                         true


3. Suppose the Hessian of the function $f$ is given by

$$H = \begin{bmatrix} -2 & -4 \\ -4 & -3 \end{bmatrix}.$$

The 1st order leading principal minors are $-2$ and $-3$. The 2nd order
leading principal minor is $|H| = -10$. Since both the first orders and
the second order are negative, the summary table indicates the Hessian is
indefinite and the stationary point found is a saddle point.


> H := Matrix([[-2, -4], [-4, -3]]);
                         H := [ -2  -4 ]
                              [ -4  -3 ]
> IsDefinite(H, query = 'indefinite');
                         true











4.2 Unconstrained Optimization: Examples 


The definitions and theorems from the previous section are put to work to 
solve a set of unconstrained optimization problems in the following examples. 
In the Maple sessions below, remember to start with a fresh document and to 
load the Student[VectorCalculus] and Student[LinearAlgebra] packages.


Example 4.8. Finding Extrema, I. 
Find and classify all the stationary points of 


$$f(x_1, x_2) = 55x_1 - 4x_1^2 + 135x_2 - 15x_2^2 - 100.$$

The computations for $f$ are relatively straightforward, so we’ll do them by
hand.
The partial derivatives are

$$\frac{\partial f}{\partial x_1} = 55 - 8x_1 \quad \text{and} \quad \frac{\partial f}{\partial x_2} = 135 - 30x_2,$$

and so the second partials are

$$\frac{\partial^2 f}{\partial x_1^2} = -8, \qquad \frac{\partial^2 f}{\partial x_1 \partial x_2} = 0, \qquad \frac{\partial^2 f}{\partial x_2^2} = -30.$$

Solving the system $\{\partial f/\partial x_1 = 0,\ \partial f/\partial x_2 = 0\}$ gives the single critical
point $x^* = (55/8, 135/30)$.

The Hessian of $f$ is

$$H = \begin{bmatrix} -8 & 0 \\ 0 & -30 \end{bmatrix}.$$

The 1st order leading principal minors (LPMs) for $f$ are $-8$ and $-30$. The 2nd
order LPM is $|H| = 240 > 0$. The 1st order LPMs are negative and the 2nd order
is positive: by Table 4.2, the Summary of Hessian Definiteness, we see that $f$
is concave at $x^*$; therefore $f(55/8, 135/30) = 392.8125$ is the maximum.
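
The hand computation in Example 4.8 can be double-checked with exact rational arithmetic. This is a small illustrative Python sketch (standard library only); the variable names are hypothetical and not from the text.

```python
from fractions import Fraction

def f(x1, x2):
    return 55 * x1 - 4 * x1**2 + 135 * x2 - 15 * x2**2 - 100

# Critical point from setting each first partial to zero.
x1_star = Fraction(55, 8)      # solves 55 - 8*x1 = 0
x2_star = Fraction(135, 30)    # solves 135 - 30*x2 = 0

# Gradient at the critical point (both entries should be zero).
grad = (55 - 8 * x1_star, 135 - 30 * x2_star)

# Constant Hessian and its determinant, the 2nd order minor.
hessian = [[-8, 0], [0, -30]]
det_h = hessian[0][0] * hessian[1][1] - hessian[0][1] * hessian[1][0]

f_star = f(x1_star, x2_star)
print(grad, det_h, float(f_star))
```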


Example 4.9. Finding Extrema, II. 
Find and classify all stationary points of the function 


$$g(x_1, x_2) = 2x_1x_2 + 4x_1 + 6x_2 - 2x_1^2 - 2x_2^2.$$


We’ll go straight to Maple to analyze $g$. To avoid indexing issues, define
$g$ as a function of $x$ and $y$. Then we can switch to $x_1$ and $x_2$ immediately.
Maple’s D operator uses “numeric” notation: $\partial/\partial x_1 = D_1$ and $\partial/\partial x_2 = D_2$,
etc.


> g := (x, y) -> 2*x*y + 4*x + 6*y - 2*x^2 - 2*y^2;
  dx1 := D[1](g);
  dx2 := D[2](g);
  ddx1 := D[1](dx1);
  ddx2 := D[2](dx2);
  ddx1x2 := D[2](dx1);
           g := (x, y) -> 2*x*y + 4*x + 6*y - 2*x^2 - 2*y^2
           dx1 := (x, y) -> -4*x + 2*y + 4
           dx2 := (x, y) -> 2*x - 4*y + 6
           ddx1 := (x, y) -> -4
           ddx2 := (x, y) -> -4
           ddx1x2 := (x, y) -> 2
> solve([dx1(x[1], x[2]) = 0, dx2(x[1], x[2]) = 0], [x[1], x[2]]);
  CP := rhs~(%[1]);
           [[x[1] = 7/3, x[2] = 8/3]]
           CP := [7/3, 8/3]
> H := Hessian(g(x[1], x[2]), [x[1], x[2]]);
           H := [ -4   2 ]
                [  2  -4 ]
> IsDefinite(H, query = 'negative_definite');
           true

We see that the 1st order LPMs are $-4$ and $-4$, which follow $(-1)^1$, and the
2nd order LPM is 12, which also follows $(-1)^2$. Thus, $H$ is negative definite (at
the point $(7/3, 8/3)$), and so $g$ is concave. It follows that $g(7/3, 8/3) \approx 12.667$
is a global maximum. See Figure 4.5.
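
Because the gradient of $g$ is linear, the critical point in Example 4.9 comes from a 2 × 2 linear system. A minimal Python sketch (standard library only) using Cramer's rule confirms the values; `solve_2x2` is a hypothetical helper, not a Maple command.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve [[a11, a12], [a21, a22]] * (x, y) = (b1, b2) by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system")
    x = Fraction(b1 * a22 - a12 * b2, det)
    y = Fraction(a11 * b2 - b1 * a21, det)
    return x, y

def g(x, y):
    return 2 * x * y + 4 * x + 6 * y - 2 * x**2 - 2 * y**2

# Gradient equations rearranged: -4x + 2y = -4 and 2x - 4y = -6.
x_star, y_star = solve_2x2(-4, 2, 2, -4, -4, -6)
g_star = g(x_star, y_star)
print(x_star, y_star, g_star, float(g_star))
```

The coefficient matrix of this system is the Hessian of $g$, so its determinant, 12, is the same 2nd order minor used in the definiteness test.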








FIGURE 4.5: Graph of $g(x_1, x_2)$


Example 4.10. Finding Extrema, III. 
Find and classify all the stationary points of 


$$h(x_1, x_2) = x_1^3 x_2 + x_1 x_2^2 - x_1 x_2.$$

We will use Maple for the computations this time also.





> h := (x, y) -> x^3*y + x*y^2 - x*y:
> dx1 := D[1](h);
  dx2 := D[2](h);
           dx1 := (x, y) -> 3*x^2*y + y^2 - y
           dx2 := (x, y) -> x^3 + 2*y*x - x





Our next step is to solve the system $\{\partial h/\partial x_1 = 0,\ \partial h/\partial x_2 = 0\}$ to find critical
points. In order to avoid complex roots and Maple’s RootOfs that occur in
equations with higher powers, we’ll solve in the RealDomain; i.e., using only real
numbers.


> use RealDomain in
    CriticalPoints := solve([dx1(x[1], x[2]) = 0, dx2(x[1], x[2]) = 0], [x[1], x[2]]);
  end use:
  CP := map(rhs~, CriticalPoints);

  CP := [[sqrt(5)/5, 2/5], [-sqrt(5)/5, 2/5], [0, 1], [-1, 0], [1, 0], [0, 0]]


We have 6 stationary points that need to be tested with the Hessian to deter- 
mine if they are local maxima, local minima, or saddle points. 








> Hessian(h(x[1], x[2]), [x[1], x[2]]);
  H := unapply(%, x):   # H will be a function of the vector x = (x[1], x[2])

           [ 6 x[1] x[2]               3 x[1]^2 + 2 x[2] - 1 ]
           [ 3 x[1]^2 + 2 x[2] - 1     2 x[1]                ]





Now evaluate the Hessian and its determinant at each of the critical points. 
We’ll use a for loop to compute the Hessian at the different critical points 
successively. Recall that the syntax of this type of loop is 


for var from start_value to stop_value do 
statements 
end do 


If “from start_value” is omitted, the start_value is assumed to be 1. 

On to the analysis. In each line, list the critical point, the Hessian at that
point, and the Hessian’s determinant, which is the 2nd order LPM, at that
critical point.

> for i to 6 do
    CP[i], H(CP[i]), Determinant(H(CP[i]));
  end do;

   [sqrt(5)/5, 2/5],    [[12 sqrt(5)/25, 2/5], [2/5, 2 sqrt(5)/5]],      4/5
   [-sqrt(5)/5, 2/5],   [[-12 sqrt(5)/25, 2/5], [2/5, -2 sqrt(5)/5]],    4/5
   [0, 1],              [[0, 1], [1, 0]],                                -1
   [-1, 0],             [[0, 2], [2, -2]],                               -4
   [1, 0],              [[0, 2], [2, 2]],                                -4
   [0, 0],              [[0, -1], [-1, 0]],                              -1


We used Maple’s Determinant command from the Student[LinearAlgebra]
package to compute $|H|$, giving us the 2nd order LPMs for each critical point.
The Minor command will give $H$’s 1st order LPMs as seen in





> Minor(H(CP[1]), 1, 1);
  Minor(H(CP[1]), 2, 2);
                         2 sqrt(5)/5
                         12 sqrt(5)/25

Determinant, IsDefinite, and Minor are all in the Student[LinearAlgebra]
package. How can the for loop above be modified to determine the definiteness 
of the Hessian at each critical point? 

Table 4.3 shows the data collected for $h$ in a chart for easier analysis.

TABLE 4.3: Analysis of h’s Critical Points

Critical Point | 1st Order LPMs | 2nd Order LPM | Classification and Results
$(0, 0)$ | $0,\ 0$; $[\ge 0]$ | $-1$ | neither, saddle point
$(1, 0)$ | $0,\ 2$; $[\ge 0]$ | $-4$ | neither, saddle point
$(0, 1)$ | $0,\ 0$; $[\ge 0]$ | $-1$ | neither, saddle point
$(-1, 0)$ | $0,\ -2$; $[\sim (-1)^1]$ | $-4$ | neither, saddle point
$(\sqrt{5}/5,\ 2/5)$ | $12\sqrt{5}/25,\ 2\sqrt{5}/5$; $[> 0]$ | $4/5$ | convex, local minimum
$(-\sqrt{5}/5,\ 2/5)$ | $-12\sqrt{5}/25,\ -2\sqrt{5}/5$; $[\sim (-1)^1]$ | $4/5$ | concave, local maximum


Using Maple to verify the definiteness of each Hessian is easy. 


> IsDefinite(H(CP[1]), query = 'positive_definite');
  IsDefinite(H(CP[2]), query = 'negative_definite');
                         true
                         true
> seq(IsDefinite(H(CP[k]), query = 'indefinite'), k = 3..6);
                         true, true, true, true





We find the maximum and minimum values are $h(-\sqrt{5}/5, 2/5) \approx 0.072$
and $h(\sqrt{5}/5, 2/5) \approx -0.072$. Finally, a graph of $h$ is shown in Figure 4.6 with
the maximum and minimum points marked as spheres and the saddle points 
as boxes. Carefully zoom in on each critical point in a graph of h to verify the 
behaviors claimed. 
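
The classification of the six critical points of $h$ can be cross-checked numerically. The sketch below is illustrative Python (standard library only); it uses the standard two-variable second derivative test on the determinant and the upper-left entry of the Hessian, and all function names are hypothetical.

```python
import math

def h(x, y):
    return x**3 * y + x * y**2 - x * y

def grad(x, y):
    """First partial derivatives of h."""
    return (3 * x**2 * y + y**2 - y, x**3 + 2 * x * y - x)

def hessian(x, y):
    """Second partial derivatives of h as a 2-by-2 nested list."""
    mixed = 3 * x**2 + 2 * y - 1
    return [[6 * x * y, mixed], [mixed, 2 * x]]

def classify(x, y):
    """Two-variable second derivative test at a critical point (x, y)."""
    (hxx, hxy), (_, hyy) = hessian(x, y)
    d = hxx * hyy - hxy**2
    if d < 0:
        return "saddle point"
    if d > 0 and hxx > 0:
        return "local minimum"
    if d > 0 and hxx < 0:
        return "local maximum"
    return "inconclusive"

r = math.sqrt(5) / 5
points = [(r, 0.4), (-r, 0.4), (0, 1), (-1, 0), (1, 0), (0, 0)]
results = {p: classify(*p) for p in points}
for p in points:
    print(p, grad(*p), results[p], h(*p))
```

Floating-point arithmetic is used here, so the gradient entries at $(\pm\sqrt{5}/5, 2/5)$ are zero only up to rounding.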







FIGURE 4.6: Graph of $h(x_1, x_2)$ and its Critical Points


We could also use Maple’s Minor command from the Student[LinearAlgebra]
package in the form


> Minor(Minor(M, 1, 1, output = matrix), 1, 1);





to compute LPMs of larger matrices $M$; however, the syntax gets complicated
rapidly because the number of leading principal minors grows quickly with the size of $M$.

Tides change the sea level around an island. Local tidal variation is influ- 
enced by a number of factors many of which are based on local topography. An 
island in a harbor is a navigational hazard. In the next example, we look for 
an island using a function based on bathymetric data of the sea bed combined 
with land height data. 


Example 4.11. Finding an Island. 
The bathymetric and topographic data sets combined lead to the sea and land 
surface model 








$$f(x_1, x_2) = -300x_2^3 - 695x_2^2 + 7x_2 - 300x_1^3 - 679x_1^2 - 235x_1 + 570$$


for $(x_1, x_2) \in [-2, 2] \times [-2, 2]$.


Going right to Maple using our templates from the previous examples, we
find:

> f := (x, y) -> -300*y^3 - 695*y^2 + 7*y - 300*x^3 - 679*x^2 - 235*x + 570:
  dx := D[1](f);
  dy := D[2](f);
           dx := (x, y) -> -900*x^2 - 1358*x - 235
           dy := (x, y) -> -900*y^2 - 1390*y + 7
> Hessian(f(x, y), [x, y]);
  H := unapply(%, [x, y]):

           [ -1800 x - 1358          0         ]
           [        0         -1800 y - 1390   ]

> use RealDomain in
    CriticalPoints := evalf[5](solve([dx(x, y) = 0, dy(x, y) = 0], [x, y])):
  end use:
  CP := map(rhs~, CriticalPoints);
  CP := [[-0.19940, -1.5495], [-1.3095, -1.5495], [-0.19940, 0.00504],
         [-1.3095, 0.00504]]

> for i to 4 do
    CP[i], f(op(CP[i])), H(op(CP[i]));
  end do;
   [-0.19940, -1.5495],   28.8149803,    [[-999.08000, 0], [0, 1399.1000]]
   [-1.3095, -1.5495],    -176.379930,   [[999.1000, 0], [0, 1399.1000]]
   [-0.19940, 0.00504],   592.2577678,   [[-999.08000, 0], [0, -1399.07200]]
   [-1.3095, 0.00504],    387.0628571,   [[999.1000, 0], [0, -1399.07200]]
> IsDefinite(H(op(CP[3])), query = 'negative_definite');
                         true
We find that at CP3, $x = -0.1994$ and $y = 0.0050$, the height is
$f(-0.1994, 0.0050) \approx 592.2578$. The Hessian $H(-0.1994, 0.0050)$ is negative
definite at that point, indicating that we have found a local maximum. Assuming
that $f(x, y) = 0$ is sea level, we have found that an island which is 592.2578
feet above sea level exists in the harbor. That’s a significant island for ships
to avoid! See Figure 4.7.


(a) Surface Data (b) Contours


FIGURE 4.7: Topographical Plot
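
In Example 4.11 the two first partials each depend on a single variable, so the critical points come from two independent quadratics. The following Python sketch (standard library only) reproduces the peak under that observation; `real_roots` and the other names are hypothetical helpers, not part of the Maple session.

```python
import math

def real_roots(a, b, c):
    """Real roots of a*t^2 + b*t + c = 0, sorted in increasing order."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    s = math.sqrt(disc)
    return sorted([(-b - s) / (2 * a), (-b + s) / (2 * a)])

def f(x, y):
    return (-300 * y**3 - 695 * y**2 + 7 * y
            - 300 * x**3 - 679 * x**2 - 235 * x + 570)

# df/dx = -900x^2 - 1358x - 235 and df/dy = -900y^2 - 1390y + 7.
xs = real_roots(-900, -1358, -235)
ys = real_roots(-900, -1390, 7)

# The Hessian is diagonal: diag(-1800x - 1358, -1800y - 1390).
# A critical point is a local maximum when both diagonal entries are negative.
peaks = [(x, y) for x in xs for y in ys
         if -1800 * x - 1358 < 0 and -1800 * y - 1390 < 0]

for x, y in peaks:
    print(x, y, f(x, y))
```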


For our last example in this section, we turn to an interesting and quite use- 
ful application of multivariate optimization: finding the least squares regres- 
sion line fitting a data set. We’ll begin by theoretically finding the regression 
line using what we’ve developed for optimization. Then we’ll step through the 
computations in Maple to find the least squares regression line for a small 
data set. 


Example 4.12. Finding the Least Squares Regression Line. 
In using least squares regression to fit a line $y = a + bx$ to a set of $n$ data
points $\{(x_i, y_i)\}$, we minimize the sum of squared errors

$$SSE = f(a, b) = \sum_{i=1}^{n} \big(y_i - (a + bx_i)\big)^2,$$

where the variables are $a$ and $b$.
Begin by finding the system of equations that will determine the critical
points. First,

$$\frac{\partial f}{\partial a} = \sum_{i=1}^{n} 2\big(y_i - (a + bx_i)\big)(-1), \qquad
\frac{\partial f}{\partial b} = \sum_{i=1}^{n} 2\big(y_i - (a + bx_i)\big)(-x_i).$$

Then $\{\partial f/\partial a = 0,\ \partial f/\partial b = 0\}$ holds when the Normal Equations for Least
Squares hold:

$$n a + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i,$$
$$a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i.$$

Remember, the summations are all constants once given a data set.


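The Hessian is

$$H = \begin{bmatrix} 2n & 2\sum_{i=1}^{n} x_i \\[4pt] 2\sum_{i=1}^{n} x_i & 2\sum_{i=1}^{n} x_i^2 \end{bmatrix}.$$

So the 1st order LPMs are $2n$ and $2\sum_{i=1}^{n} x_i^2$, which are both greater than 0. The
2nd order LPM is

$$|H| = 4\left[ n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 \right],$$

which can also be shown to be greater than zero whenever the $x_i$ are not all
equal, since $n\sum x_i^2 - (\sum x_i)^2 = n\sum (x_i - \bar{x})^2$ with $\bar{x}$ the mean of the $x_i$.
Therefore, $H$ is positive definite, and the critical point $(a, b)$ must be a minimum.

Let’s look at this optimization computation in Maple with a very small
data set:


X | 1  2    3
Y | 2  4.8  7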


Once more, we go right to Maple. 


> X, Y := [1, 2, 3], [2, 4.8, 7]:
  n := 3:
  sse := add((Y[i] - (a + b*X[i]))^2, i = 1..n);   # add indexes the lists only at numeric i
  SSE := unapply(sse, [a, b]):
           sse := (-a - b + 2)^2 + (-a - 2*b + 4.8)^2 + (-a - 3*b + 7)^2
> da := D[1](SSE);
  db := D[2](SSE);
           da := (a, b) -> -27.6 + 6*a + 12*b
           db := (a, b) -> -65.2 + 12*a + 28*b
> solve([da(a, b) = 0, db(a, b) = 0], [a, b]);
  CP := %[1];
           [[a = -0.4000000000, b = 2.500000000]]
           CP := [a = -0.4000000000, b = 2.500000000]
> H := Hessian(SSE(a, b), [a, b]);
           H := [  6  12 ]
                [ 12  28 ]
> IsDefinite(H, query = 'positive_definite');
           true
> L := subs(CP, a + b*x);
           L := 2.500000000 x - 0.4000000000


The Hessian matrix is positive definite, so we have found the minimum for
SSE, the sum of the squared errors. The linear regression line for the small data
set is $y = 2.5x - 0.4$. The graph in Figure 4.8 shows how well the regression
line fits the data.
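
The Normal Equations of Example 4.12 can be solved in closed form for any data set. The sketch below is a minimal Python illustration (standard library only) under the assumption that the $x_i$ are not all equal; `fit_line` is a hypothetical name, not a Maple or library routine.

```python
def fit_line(xs, ys):
    """Solve the least squares Normal Equations for the line y = a + b*x."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # n*sxx - sx^2 is one quarter of the Hessian determinant |H|.
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b

a, b = fit_line([1, 2, 3], [2, 4.8, 7])
print(a, b)
```

The two formulas for `a` and `b` are Cramer's rule applied to the Normal Equations, which is why the same determinant appears in both.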














FIGURE 4.8: Regression Line with Data 





Exercises 


1. Indicate both the definiteness (positive definite, positive semi-definite,
negative definite, negative semi-definite, or indefinite) of the given Hessian
matrices and the concavity of the function $f$ from which these Hessians
were derived.
3 5 3 2 1 -2

-3 4 6r 0 xz? 0 
inal j ote 7 pisi a] 

2 2 3 1 2 2 Qe 2 2 
gH=|2 6 4) hAH=/2 1 2| i=] 2 -12% -1 

34 4 22 i 2 -1i 182 


2. Determine the convexity of the given function using the Hessian matrix
$H$. Find and classify the function’s critical points.

a. $f(x, y) = x^2 + 3xy - y^2$

b. $f(x, y) = x^2 + y^2$

c. $f(x, y) = -x^2 - xy - 2y^2$

d. $f(x, y) = 3x + 5y - 4x^2 + y^2 - 5xy$

e. $f(x, y, z) = 2x + 3y + 3z - xy + xz - yz - x^2 - 3y^2 - z^2$


3. Determine values of $a$, $b$, and $c$ such that $g(x, y) = ax^2 + bxy + cy^2$ is
convex. Determine the values where $g$ is concave.


4. Find and classify all the extreme points for the following functions. 











(a) $f(x, y) = x^2 + 3xy - y^2$

(b) $f(x, y) = x^2 + y^2$

(c) $f(x, y) = -x^2 - xy - 2y^2$

(d) $f(x, y) = 3x + 5y - 4x^2 + y^2 - 5xy$

(e) $f(x, y) = 3x^3 + 7x^2 + 2x + 3y^3 + 7y^2 - y - 5$

(f) $f(x, y, z) = 2x + 3y + 3z - xy + xz - yz - x^2 - 3y^2 - z^2$








5. Find and classify all critical points of $f(x, y) = e^{x - y} + x^2 + y^2$.


6. Find and classify all critical points of $f(x, y) = (x^2 + y^2)^{1.5} - 4(x^2 + y^2)$.





7. Three oil wells are located at coordinates (0,0), (12,6), and (10, 20). Each 
well produces an equal amount of oil. A pipeline is to be laid from each 
oil well to a central refinery located at (a,b). Where should the refinery 
be located to minimize the total squared Euclidean distance? 


$$d = \sum_{i=1}^{3} \left[ (x_i - a)^2 + (y_i - b)^2 \right]$$

That is, determine the optimal location (a, b). 
8. A small company is planning to install a central computer with cable links 
to five departments. According to the floor plan, the peripheral computers 
for the five departments will be situated as shown by Figure 4.1 (pg. 122) 
based on the coordinates listed in Table 4.1. The company wishes to locate 
the central computer to minimize the total amount of cable used to link to 
the five peripheral computers. Assuming that cable may be strung over the 
ceiling panels in a straight line from a point above any peripheral to a point 
above the central computer, the standard distance formula may be used 
to determine the length of cable needed. Ignore all lengths of cable from 
the computer itself to a point directly above the ceiling panel over that 
computer. That is, work only with straight lengths of cable strung above 
the ceiling panels. Find the optimal location for the central computer. 


9. The first partials of a function $f(x, y)$ evaluated at $(0,0)$ are $\partial f/\partial x = -5$ 
and $\partial f/\partial y = 1$. The Hessian of $f$ at $(0,0)$ is 

$$H(0,0) = \begin{bmatrix} 6 & -1 \\ -1 & 2 \end{bmatrix}.$$


Use your knowledge of partial derivatives and Hessians to find the point 
(x*,y*) that minimizes f. What is the minimum value? Show all work. 


10. The water depth in a region where we want to build a port is given by the 
function 
x — 2y 


k = —_—; 
(x,y) 1 + 2x? + y? 
Identify the general location of any entry points that will cause problems 
for ships entering into the region. 


11. The water depth in a region where we want to build a port is given by the 
function 
10(2 + y)* 
11 — 30x — 25x? 
— 20(3 + y) — 30 cos(10 + (6 + y)”) + 4sin(28 + 3x + 6y + xy) 








f(z,y) = 10(3 + x)? + 3(6 + x) — 20(2 + y)? 


Identify the general location of any entry points that will cause problems 
for ships entering into the region. 





4.3 Unconstrained Optimization: Numerical Methods 


We’ve been investigating analytical techniques to solve unconstrained 
multivariable optimization problems 

$$\max f(x_1, x_2, \ldots, x_n). \qquad (4.1)$$

In many problems (most real problems!), it is quite difficult to find the 
stationary or critical points and then to determine their nature. In this 
section, we'll discuss numerical techniques to maximize or minimize a 
multivariable function. 


Gradient Search Methods 


We want to solve the unconstrained nonlinear programming problem (NLP) 
given in Equation 4.1 above. Calculus tells us that if a function f is concave, 
then the optimal solution—if there is one—will occur at a stationary point 
x*; i.e., at a point x* where 


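$$\nabla f(\mathbf{x}^*) = \left( \frac{\partial f}{\partial x_1}(\mathbf{x}^*),\ \frac{\partial f}{\partial x_2}(\mathbf{x}^*),\ \ldots,\ \frac{\partial f}{\partial x_n}(\mathbf{x}^*) \right) = \mathbf{0}.$$
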
Finding the stationary points in many problems is quite difficult. The methods 
of Steepest Ascent (maximization) and Steepest Descent (minimization) offer 
an alternative by finding an approximation to a stationary point. We’ll focus 
on the gradient method, the Steepest Ascent. 

Examine the function shown in Figure 4.9. 





FIGURE 4.9: A Surface Defined by a Function 


We want to find the maximum point on the surface. If we started at the 
bottom of the hill, we might proceed by finding the gradient vector, since the 
gradient is the vector that points “up the hill” in the direction of maximum 
increase. The gradient vector is defined as 

$$\nabla f(\mathbf{x}) = \left[ \frac{\partial f(\mathbf{x})}{\partial x_1},\ \frac{\partial f(\mathbf{x})}{\partial x_2},\ \ldots,\ \frac{\partial f(\mathbf{x})}{\partial x_n} \right].$$











(The symbol $\nabla$ is called “del” or “nabla”.⁴) 

If we were lucky, the gradient would point all the way to the maximum of 
the function, but the contours of functions do not always cooperate—actually, 
they rarely do. The gradient “points uphill,” but for how far? We need to find 
the distance along the line given by the gradient to travel that maximizes the 
height of the function in that direction. From that new point, we compute a 
new gradient vector to find a new direction that “points uphill” in the direction 
of maximum increase. Continue this process until reaching the “top of the 
hill,” the maximum of f. 

To summarize: from a given starting point, we move in the direction of the 
gradient as long as f’s value continues to increase. At that point, recalculate 
the gradient and move in the new direction as far as f continues to improve. 
This process continues until we achieve a maximum value within some spe- 
cific tolerance (margin of acceptable error). The algorithm for the Method of 
Steepest Ascent using the gradient is: 





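Method of Steepest Ascent Algorithm 


Find the unconstrained maximum of the function $f : \mathbb{R}^n \to \mathbb{R}$. 
INPUTS:  Function: $f$ 
         Starting point: $\mathbf{x}_0$ 
         Tolerance: $\varepsilon$ 
OUTPUTS: Maximal point $\mathbf{x}^*$ and maximum value $f(\mathbf{x}^*)$ 


Step 1. Initialize: Set $\mathbf{x} = \mathbf{x}_0$. 
Step 2. Calculate the gradient $\mathbf{g} = \nabla f(\mathbf{x})$. 
Step 3. Compute the maximizer $t^*$ of the 1-variable function 
        $\varphi(t) = f(\mathbf{x} + t\,\mathbf{g})$. 
Step 4. Find the new point $\mathbf{x}_{new} = \mathbf{x} + t^*\,\mathbf{g} = \mathbf{x} + t^*\,\nabla f(\mathbf{x})$. 
Step 5. If $\|\mathbf{x} - \mathbf{x}_{new}\| < \varepsilon$ OR $\|\nabla f(\mathbf{x})\| < \varepsilon$, then STOP 
        and return estimates $\mathbf{x}^* = \mathbf{x}_{new}$ and $f(\mathbf{x}_{new})$. 
        Otherwise, set $\mathbf{x} = \mathbf{x}_{new}$ and return to Step 2. 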














Remember that $\|\mathbf{y}\| = \|(y_1, y_2, \ldots, y_n)\| = \sqrt{y_1^2 + y_2^2 + \cdots + y_n^2}$. 
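The five steps of the algorithm can be sketched directly in Maple. The procedure below is only an illustration under stated assumptions: the name SteepestAscentSketch is hypothetical (it is not the SteepestAscent routine from the book's PSMv2 package), it handles only an expression in the two global names x and y, and it assumes the line-search maximizer lies in the bracket 0..10.

```maple
# Hypothetical helper, not the PSMv2 SteepestAscent: steepest ascent for an
# expression F in the global names x and y (both must be unassigned).
SteepestAscentSketch := proc(F, x0, y0, tol, maxit)
    local xc, yc, gx, gy, t, phi, tstar, xn, yn, k;
    global x, y;
    xc, yc := x0, y0;                                   # Step 1: initialize
    for k to maxit do
        # Step 2: gradient at the current point
        gx := evalf(eval(diff(F, x), [x = xc, y = yc]));
        gy := evalf(eval(diff(F, y), [x = xc, y = yc]));
        if sqrt(gx^2 + gy^2) < tol then break end if;   # gradient test of Step 5
        # Step 3: maximize phi(t) = F along the gradient line
        # (the bracket 0..10 is an assumption of this sketch)
        phi := eval(F, [x = xc + t*gx, y = yc + t*gy]);
        tstar := fsolve(diff(phi, t) = 0, t = 0 .. 10);
        # Step 4: new point
        xn, yn := xc + tstar*gx, yc + tstar*gy;
        # Step 5: stop when the move is shorter than the tolerance
        if sqrt((xn - xc)^2 + (yn - yc)^2) < tol then
            xc, yc := xn, yn;
            break;
        end if;
        xc, yc := xn, yn;
    end do;
    return [xc, yc], evalf(eval(F, [x = xc, y = yc]));
end proc:

# Usage: a concave quadratic whose maximum is at (1, 1)
SteepestAscentSketch(2*x*y + 2*y - x^2 - 2*y^2, 0, 0, 0.01, 50);
```

Solving phi'(t) = 0 with fsolve is a shortcut that suits smooth, concave examples; for less well-behaved functions, one of the one-dimensional search methods from Chapter 3 would be a safer choice for Step 3.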
It’s time for several examples using the method of steepest ascent. 





⁴ Wikipedia (https://en.wikipedia.org/wiki/Nabla_symbol) has a humorous history of $\nabla$. 
Example 4.13. Steepest Ascent Example, I. 
Maximize $f(\mathbf{x}) = 2x_1x_2 + 2x_2 - x_1^2 - 2x_2^2$ to within $\varepsilon = 0.01$. 


We start with $\mathbf{x}_0 = (0,0)$. 
ITERATION 1. 
The gradient of $f(x_1, x_2)$, $\nabla f$, is found using the partial derivatives; $f$’s 
gradient is the vector $\nabla f(x_1, x_2) = (2x_2 - 2x_1,\ 2x_1 + 2 - 4x_2)$. Then 
$\nabla f(0,0) = (0, 2)$. From $(0,0)$, we move along (up) the $x_2$-axis in the direction 
of $(0,2)$; that is, along the line $L(t) = \mathbf{x}_0 + \nabla f(\mathbf{x}_0)\,t = (0,0) + t(0,2) = (0, 2t)$. 
How far do we go? 

We need to maximize the function $\varphi(t) = f(0, 2t) = 4t - 8t^2$ starting 
at $t = 0$ to find how far along the line $L(t)$ to step. This function can be 
maximized by using any of the one-dimensional search techniques for single- 
variable optimization from Chapter 3 or by simple calculus. 

$$\frac{d\varphi}{dt} = 4 - 16t = 0 \quad \text{when } t^* = 0.25.$$

Then $L(t^*)$ gives the new point $\mathbf{x}_1 = L(0.25) = (0, 0.5)$. 

The magnitude of the difference $(\mathbf{x}_0 - \mathbf{x}_1)$ is 0.5, which is not less than our 
desired tolerance of 0.01 (chosen arbitrarily at the beginning of the solution). 
Since the stopping criterion is not yet met, we continue. Repeat the calculations 
from the new point $(0, 0.5)$. 

ITERATION 2. 
The gradient vector is $\mathbf{g} = \nabla f((0, 0.5)) = (1, 0)$. Now $L(t) = (0, 0.5) + (1, 0)t$. 
Again, how far do we travel along $L$ to maximize $\varphi(t) = f((t, 0.5)) = -t^2 + t + 0.5$? 
Simple calculus gives 

$$\frac{d\varphi}{dt} = -2t + 1 = 0 \quad \text{when } t^* = 0.5.$$

Then the new point is $\mathbf{x}_2 = L(0.5) = (0.5, 0.5)$. 

The magnitude of the difference $(\mathbf{x}_1 - \mathbf{x}_2)$ is 0.5, which is still not less than 
our desired tolerance of 0.01. The magnitude of $\nabla f(\mathbf{x}_1)$ is 1, which is also not 
within our tolerance 0.01. We continue to iterate. 

Maple will continue the iterations for us using the function SteepestAscent 
which is in the book’s PSMv2 package. Load the package via with(PSMv2). 

Define the function using vectors. 


[> f := x -> 2*x[1]*x[2] + 2*x[2] - x[1]^2 - 2*x[2]^2 : 


The syntax of our SteepestAscent function is 





SteepestAscent(f, (x0, y0), ε) 


Adding a fourth argument, graph, tells SteepestAscent to produce a graph of the 
path $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_n$. The output of SteepestAscent is a DataFrame containing the 
generated points, gradients, function values, and step sizes. 
[> SAf := SteepestAscent(f, (0,0), 0.01, graph); 

SAf := (Fcn and Grad_Step columns of the first ten rows; the Pts and Gradient columns are not reproduced) 

       Fcn       Grad_Step 
  1    0.        0.2500 
  2    0.5000    0.5000 
  3    0.7500    0.2500 
  4    0.8750    0.5000 
  5    0.9375    0.2500 
  6    0.9688    0.5000 
  7    0.9844    0.2500 
  8    0.9922    0.5000 
  9    0.9961    0.2500 
 10    0.9980    0.5000 
  ⋮ 

14 x 4 DataFrame 
To see the list of points generated, use SAf['Pts']; to see just the last point 
and its function value, use SAf['Pts'][-1], SAf['Fcn'][-1]. 

[> SAf['Pts'][-1], SAf['Fcn'][-1]; 

   [0.9844, 0.9922], 0.9999 





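The multivariable calculus solution is straightforward to compute by solving 
the system $\{\partial f/\partial x_1 = 0,\ \partial f/\partial x_2 = 0\}$ and checking $f$’s Hessian at the 
critical point. 

$$\left. \begin{aligned} \frac{\partial f}{\partial x_1} &= 2x_2 - 2x_1 = 0 \\ \frac{\partial f}{\partial x_2} &= 2x_1 + 2 - 4x_2 = 0 \end{aligned} \right\} \quad \Longrightarrow \quad \mathbf{x}^* = (x_1, x_2) = (1, 1)$$

The Hessian 

$$H = \begin{bmatrix} -2 & 2 \\ 2 & -4 \end{bmatrix}$$

is negative definite, so the point $\mathbf{x}^* = (1,1)$ is a maximum with $f(\mathbf{x}^*) = 1$. 
To get a closer approximation with the steepest ascent method, we would 
make our tolerance smaller. A look at $f$’s contour plot in Figure 4.10 confirms 
a hill at approximately $(1,1)$. 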
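The negative definiteness of the Hessian can be checked by hand with the leading principal minors test. This is a standard criterion recalled here as a reminder, not a step taken from the Maple session: a symmetric 2 × 2 matrix is negative definite when its upper-left entry is negative and its determinant is positive. The minors are written Δ₁ and Δ₂ for this illustration, and the entries are the second partials of the gradient $(2x_2 - 2x_1,\ 2x_1 + 2 - 4x_2)$.

```latex
H = \begin{bmatrix}
      \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\[2ex]
      \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2}
    \end{bmatrix}
  = \begin{bmatrix} -2 & 2 \\ 2 & -4 \end{bmatrix},
\qquad
\Delta_1 = -2 < 0,
\qquad
\Delta_2 = (-2)(-4) - (2)(2) = 4 > 0 .
```

Because the Hessian is constant, the test holds at every point, so $f$ is concave everywhere and the critical point is a global maximum.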





FIGURE 4.10: Contour Plot of f 
Example 4.14. Steepest Ascent Example, II. 
Maximize $g(\mathbf{x}) = 55x_1 - 4x_1^2 + 135x_2 - 15x_2^2 - 100$ using the Steepest Ascent 
method starting at the point $(1,1)$ to within a tolerance of $\varepsilon = 0.01$. 





Figure 4.11 provides a visual reference for the maximum. 





FIGURE 4.11: Surface Plot of g Showing Contours 


The gradient is $\nabla g(\mathbf{x}) = (55 - 8x_1,\ 135 - 30x_2)$. 
ITERATION 1. 
We begin with $\mathbf{x}_0 = (1,1)$. Then $\nabla g(\mathbf{x}_0) = (47, 105)$. From $(1,1)$, we move 
in the direction of $(47, 105)$. How far do we go along the line $L(t) = (1,1) + (47, 105)t$? 
We need to maximize the function 

$$\begin{aligned} g(L(t)) &= g(\mathbf{x}_0 + t\,\nabla g(\mathbf{x}_0)) \\ &= -4(1 + 47t)^2 - 15(1 + 105t)^2 + 90 + 16760t \\ &= -174211t^2 + 13234t + 71. \end{aligned}$$

This function can also be maximized by simple single-variable calculus: 

$$\frac{d}{dt}\, g(L(t)) = 0 \quad \Longrightarrow \quad t^* = 0.03798$$

The new point $\mathbf{x}_1$ is found by evaluating $L(t^*) = \mathbf{x}_0 + t^*\,\nabla g(\mathbf{x}_0)$. 

$$\mathbf{x}_1 = (1,1) + t^* (47, 105) = (2.785, 4.988)$$

Since $\|\mathbf{x}_1 - \mathbf{x}_0\| > 0.01$, iterate again. 
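As a cross-check on the step size found in Iteration 1, note that $g$ is quadratic, so the line-search function is a parabola in $t$ and its maximizer has a closed form. The symbols $\mathbf{d}$ (the gradient at the current point) and $H$ (the constant Hessian of $g$) are introduced here only for this illustration.

```latex
\varphi(t) = g(\mathbf{x}_0 + t\,\mathbf{d})
           = g(\mathbf{x}_0) + t\,\mathbf{d}^{T}\mathbf{d} + \tfrac{1}{2}\,t^{2}\,\mathbf{d}^{T} H\,\mathbf{d},
\qquad
\mathbf{d} = \nabla g(\mathbf{x}_0) = (47,\,105),
\qquad
H = \begin{bmatrix} -8 & 0 \\ 0 & -30 \end{bmatrix}

t^{*} = -\,\frac{\mathbf{d}^{T}\mathbf{d}}{\mathbf{d}^{T} H\,\mathbf{d}}
      = \frac{47^{2} + 105^{2}}{8\cdot 47^{2} + 30\cdot 105^{2}}
      = \frac{13234}{348422}
      \approx 0.03798
```

The numerator 13234 is the coefficient of $t$ in $g(L(t))$, and the denominator is twice the magnitude of the $t^2$ coefficient, $2 \times 174211 = 348422$, in agreement with the expansion above.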
ITERATION 2. 

Now compute $\nabla g(\mathbf{x}_1) = (32.719, -14.645)$. We move from $(2.785, 4.988)$ in 
the direction of $(32.719, -14.645)$. How far do we go along the line 
$L(t) = (2.785, 4.988) + (32.719, -14.645)t$? We need to maximize the function 

$$g(L(t)) = g(\mathbf{x}_1 + t\,\nabla g(\mathbf{x}_1)) = 322.33 + 1285.0t - 7499.3t^2.$$

This function can also be maximized by simple single-variable calculus: 

$$\frac{d}{dt}\, g(L(t)) = 0 \quad \Longrightarrow \quad t^* = 0.08567$$

The new point $\mathbf{x}_2$ is found by evaluating $L(t^*) = \mathbf{x}_1 + t^*\,\nabla g(\mathbf{x}_1)$. 

$$\mathbf{x}_2 = (5.588, 3.733)$$

Since $\|\mathbf{x}_2 - \mathbf{x}_1\| > 0.01$, we must iterate again. 
Use our SteepestAscent function to have Maple finish the iterations. Define g 
as a function of the vector x. 


[> g := x -> -4*x[1]^2 - 15*x[2]^2 + 55*x[1] + 135*x[2] - 100 : 
[> SAg := SteepestAscent(g, (1,1), 0.01, graph); 

SAg := (Fcn and Grad_Step columns of the first ten rows; the Pts and Gradient columns are not reproduced) 

       Fcn      Grad_Step 
  1    71.      0.03798 
  2    322.3    0.08567 
  3    377.4    0.03798 
  4    389.4    0.08567 
  5    392.1    0.03798 
  6    392.7    0.08567 
  7    392.8    0.03798 
  8    392.8    0.08567 
  9    392.8    0.03798 
 10    392.8    0.08567 
  ⋮ 

11 x 4 DataFrame 
The final point $\mathbf{x}_{11}$ and $g$’s maximum value are: 


[> SAg['Pts'][-1], SAg['Fcn'][-1]; 

   [6.872, 4.498], 392.8 


This time, the solution process zig-zagged and converged more slowly to 
an optimal solution as illustrated in Figure 4.12. 


FIGURE 4.12: Zig-Zagging to a Solution 


We’ll now look at maximizing a transcendental multivariable function 
where the critical points cannot be found analytically in a closed form, but 
must be approximated numerically. 


Example 4.15. Steepest Ascent Example, III. 
Maximize $h(\mathbf{x}) = 2x_1x_2 + 2x_2 - e^{x_1} - e^{x_2} + 10$ using the Steepest Ascent 
method starting at the point $(0.5, 0.5)$ to within a tolerance of $\varepsilon = 0.01$. 


Verify that h has a local maximum near (1, 1.25) by looking at a graph. 
The gradient of h is 

$$\nabla h(\mathbf{x}) = \left( 2x_2 - e^{x_1},\ 2x_1 + 2 - e^{x_2} \right)$$


ITERATION 1. 

We begin with $\mathbf{x}_0 = (0.5, 0.5)$. Then $\nabla h(\mathbf{x}_0) = (-0.6487, 1.3513)$. From 
$(0.5, 0.5)$, we move in the direction of $(-0.6487, 1.3513)$. How far do we go 
along the line $L(t) = (0.5, 0.5) + (-0.6487, 1.3513)t$? We need to maximize the 
function 

$$\begin{aligned} h(L(t)) &= h(\mathbf{x}_0 + t\,\nabla h(\mathbf{x}_0)) \\ &= 11.5 + 3.405t - 1.753t^2 - e^{0.5 - 0.6487t} - e^{0.5 + 1.351t} \end{aligned}$$

Maple’s fsolve gives $t^* = 0.2875$. Therefore 
$\mathbf{x}_1 = \mathbf{x}_0 + t^*\,\nabla h(\mathbf{x}_0) = (0.3136, 0.8883)$. 
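The fsolve step can be sketched as follows. This is an illustration only: the names phi, tstar, and x1 are chosen here, the coefficients are the rounded values printed in the text, and the search bracket 0..1 is an assumption made for this sketch.

```maple
# Line-search function from Iteration 1, with the coefficients as printed
phi := t -> 11.5 + 3.405*t - 1.753*t^2 - exp(0.5 - 0.6487*t) - exp(0.5 + 1.351*t):

# Root of phi'(t) = 0; the bracket 0..1 is an assumption of this sketch
tstar := fsolve(diff(phi(t), t) = 0, t = 0 .. 1);

# New point x1 = x0 + tstar * grad h(x0)
x1 := [0.5 - 0.6487*tstar, 0.5 + 1.351*tstar];
```

Since the second derivative of phi is negative for every t, phi'(t) is strictly decreasing and the root found is the unique maximizer along this line; it should agree with the reported $t^* = 0.2875$ up to the rounding of the coefficients.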


Since $\|\mathbf{x}_1 - \mathbf{x}_0\| > 0.01$, we continue. But let Maple do the work. 





[> h := x -> 2*x[1]*x[2] + 2*x[2] - exp(x[1]) - exp(x[2]) + 10 : 


[> SAh := SteepestAscent(h, (0.5, 0.5), 0.01); 

SAh := (Fcn and Grad_Step columns; entries marked … are not legible, and the Pts and Gradient columns are not reproduced) 

       Fcn      Grad_Step 
  1    …        0.2875 
  2    8.534    1.704 
  3    8.773    0.2001 
  4    …        … 
  5    8.827    … 
  6    8.828    … 

The final point $\mathbf{x}_6$ and $h$’s local maximum value are: 


[> SAh['Pts'][-1], SAh['Fcn'][-1]; 

   [1.025, 1.397], 8.828 


Modified Newton’s Method 


The zig-zag pattern we saw in Figure 4.12 shows that steepest ascent doesn’t 
always go directly to an optimum value. The Newton-Raphson⁵ iterative root 
finding technique, using the partial derivatives of the function, provides an 
alternative numerical search method for an optimum value when modified 
appropriately. Given the right conditions, this numerical method is more 
efficient and converges faster to an approximate optimal solution: quadratic 
convergence versus the linear convergence of steepest ascent. 

⁵ Newton invented the method in 1671, but it wasn’t published until 1736; Raphson 
independently discovered the method and published it in 1690. 

Newton’s Method for multivariable optimization searches is based on the 
single-variable root-finding algorithm. Modify the procedure to look for roots 
of the first derivative rather than roots of the original function: 

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \quad \Longrightarrow \quad x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)}$$

and iterate until $|x_{n+1} - x_n| < \varepsilon$ for our chosen tolerance $\varepsilon > 0$. 
Extend the Modified Newton’s Method to several variables by using gradients 
and the Hessian, the matrix of second partial derivatives: 

$$x_{n+1} = x_n - \left( f''(x_n) \right)^{-1} f'(x_n) \quad \Longrightarrow \quad \mathbf{x}_{n+1} = \mathbf{x}_n - \left( H(\mathbf{x}_n) \right)^{-1} \nabla f(\mathbf{x}_n)$$

now iterating until $\|\mathbf{x}_{n+1} - \mathbf{x}_n\| < \varepsilon$ for our chosen tolerance $\varepsilon > 0$. Applying 
a little bit of linear algebra gives us the method as an algorithm. 





Modified Newton’s Method Algorithm 


Find the unconstrained maximum of the function $f : \mathbb{R}^n \to \mathbb{R}$. 
INPUTS:  Function: $f$ 
         Starting point: $\mathbf{x}_0$ 
         Tolerance and Maximum Iterations: $\varepsilon$, $N$ 
OUTPUTS: Maximal point $\mathbf{x}^*$ and maximum value $f(\mathbf{x}^*)$ 
Step 1. Initialize: Set $\mathbf{x} = \mathbf{x}_0$. 
Step 2. Calculate the gradient $\mathbf{g} = \nabla f(\mathbf{x})$ and $H = \text{Hessian}(\mathbf{x})$. 
Step 3. Compute $d = |H|$ and create new matrices 
        $H_1$ = substitute $\mathbf{g}$ for column 1 of $H$ 
        $H_2$ = substitute $\mathbf{g}$ for column 2 of $H$ 
Step 4. Compute: $\Delta x_1 = |H_1|/d$ and $\Delta x_2 = |H_2|/d$. 
Step 5. Find the new point $\mathbf{x}_{new} = (x_1 - \Delta x_1,\ x_2 - \Delta x_2)$. 
Step 6. If $\|(\Delta x_1, \Delta x_2)\| < \varepsilon$, then STOP 
        and return estimates $\mathbf{x}^* = \mathbf{x}_{new}$ and $f(\mathbf{x}_{new})$. 
        Otherwise, set $\mathbf{x} = \mathbf{x}_{new}$ and return to Step 2. 











Again, remember that $\|\mathbf{y}\| = \|(y_1, y_2)\| = \sqrt{y_1^2 + y_2^2}$. 
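A single Newton step can be sketched with standard LinearAlgebra and VectorCalculus commands. This is an illustration under stated assumptions, not the book's ModifiedNewtonMethod routine: the names F, vars, gradF, H, pt, g0, H0, dx, and xnew are chosen here, the test function is the concave quadratic $2x_1x_2 + 2x_2 - x_1^2 - 2x_2^2$, and the update used is the standard Newton form $\mathbf{x}_{new} = \mathbf{x} - H^{-1}\nabla f(\mathbf{x})$, computed with LinearSolve rather than Cramer's rule.

```maple
with(LinearAlgebra):

F := 2*x1*x2 + 2*x2 - x1^2 - 2*x2^2:            # objective as an expression
vars := [x1, x2]:
gradF := Vector([seq(diff(F, v), v in vars)]):  # gradient of F
H := VectorCalculus:-Hessian(F, vars):          # matrix of second partials

pt := [x1 = 0, x2 = 0]:                         # starting point x0
g0 := eval(gradF, pt):                          # gradient at x0
H0 := eval(H, pt):                              # Hessian at x0 (constant here)

# Newton step: solve H0 . dx = g0, then x_new = x0 - dx
dx := LinearSolve(H0, g0):
xnew := Vector([0, 0]) - dx;

# The step heads toward a maximum only where the Hessian is negative definite
IsDefinite(H0, query = 'negative_definite');
```

Because F is quadratic, the gradient is linear and one step from (0, 0) lands exactly on the critical point (1, 1); for a non-quadratic function the step would be repeated until the change in x falls below the tolerance.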
Let’s repeat our examples for the steepest ascent method using Newton’s 
method. We’ll use the function ModifiedNewtonMethod, which is in the book’s 
PSMv2 package. Load the package via with(PSMv2). The syntax of the 
ModifiedNewtonMethod function is 


ModifiedNewtonMethod(f, (x0, y0), ε, N) 


where f is the function, (x0, y0) is the starting vector, ε is the tolerance, and 
N is the maximum number of iterations allowed. The output of 
ModifiedNewtonMethod is a DataFrame containing the generated points, function 
values, Hessians, and the definiteness of the Hessian. 


Example 4.16. Modified Newton’s Method Example, I. 
Maximize $f(\mathbf{x}) = 2x_1x_2 + 2x_2 - x_1^2 - 2x_2^2$ to within $\varepsilon = 0.01$ starting at 
$\mathbf{x}_0 = (0, 0)$. 


Limit the number of iterations to N = 20. 
[> f := x -> 2*x[1]*x[2] + 2*x[2] - x[1]^2 - 2*x[2]^2 : 


[> MNf := ModifiedNewtonMethod(f, (0,0), 0.01, 20); 

MNf := 

       x_k       F(x_k)    Hessian                 definiteness 
  1    (0, 0)    0.        [[-2, 2], [2, -4]]      negative_definite 
  2    (1, 1)    1.        [[-2, 2], [2, -4]]      negative_definite 
  3    (1, 1)    1.        [[-2, 2], [2, -4]]      negative_definite 


We have a maximum of f of 1 at (1,1) since the Hessian is negative definite 
there. 


On to the second example. 

Example 4.17. Modified Newton’s Method Example, II. 
Maximize $g(\mathbf{x}) = 55x_1 - 4x_1^2 + 135x_2 - 15x_2^2 - 100$ starting at the point $(1,1)$ 
to within a tolerance of $\varepsilon = 0.01$, limiting iterations to N = 20. 


[> g := x -> 55*x[1] - 4*x[1]^2 + 135*x[2] - 15*x[2]^2 - 100 : 


[> MNg := ModifiedNewtonMethod(g, (1,1), 0.01, 20); 

MNg := 

       x_k               F(x_k)    Hessian                    definiteness 
  1    (1., 1.)          71.       [[-8., 0.], [0., -30.]]    negative_definite 
  2    (6.875, 4.500)    392.9     [[-8., 0.], [0., -30.]]    negative_definite 
  3    (6.875, 4.500)    392.9     [[-8., 0.], [0., -30.]]    negative_definite 


We have a maximum of g of 392.9 at (6.875, 4.500) since the Hessian is negative 
definite there. (Evaluating g directly at (6.875, 4.5) gives 392.8125, so the last 
digit of the displayed value is not reliable.) 
Now, for the third example. 


Example 4.18. Modified Newton’s Method Example, III. 
Maximize $h(\mathbf{x}) = 2x_1x_2 + 2x_2 - e^{x_1} - e^{x_2} + 10$ starting at the point $(0.8, 0.8)$ 
to within a tolerance of $\varepsilon = 0.01$ with no more than N = 20 iterations. 


[> h := x -> 2*x[1]*x[2] + 2*x[2] - exp(x[1]) - exp(x[2]) + 10 : 
[> MNh := ModifiedNewtonMethod(h, (0.8, 0.8), 0.01, 20); 

MNh := (F(x_k) and definiteness columns; the x_k and Hessian columns are not reproduced) 

       F(x_k)    definiteness 
  1    8.428     negative_definite 
  2    3.306     negative_definite 
  3    7.851     negative_definite 
  4    8.723     negative_definite 
  5    8.825     negative_definite 
  6    8.829     negative_definite 
  7    8.827     negative_definite 


We have a maximum of h of 8.827 at (1.031, 1.402) since the Hessian is negative 
definite there. 

Even though Newton’s method is faster and more direct, we must be cau- 
tious. The method requires a relatively good starting point. Try searching for 
a maximum for h starting at (0.5,0.5) and looking at just the last entry in 
the DataFrame report. 





[> MNh2 := ModifiedNewtonMethod(h, (0.5, 0.5), 0.01, 20) : 
   MNh2[-1, ..]; 

   x_k             [-0.2667, 0.3829] 
   F(x_k)          8.329 
   Hessian         [[-0.7659, 2.], [2., -1.467]] 
   definiteness    indefinite 
We started at (0.5,0.5), not far from our original (0.8,0.8). But the Hessian 
being indefinite tells us that we’ve found an approximate saddle point, not 
a maximum. Modified Newton’s method is very sensitive to the initial value 
chosen. 
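The indefiniteness reported by Maple can be confirmed by hand from the Hessian entries in the last row of the report. The determinant test used below is a standard fact for symmetric 2 × 2 matrices, recalled here for illustration: a negative determinant means the two eigenvalues have opposite signs.

```latex
H(\mathbf{x}) = \begin{bmatrix} -e^{x_1} & 2 \\ 2 & -e^{x_2} \end{bmatrix}
\approx \begin{bmatrix} -0.7659 & 2 \\ 2 & -1.467 \end{bmatrix},
\qquad
\det H \approx (-0.7659)(-1.467) - (2)(2) \approx 1.124 - 4 = -2.876 < 0 .
```

With one positive and one negative eigenvalue, the surface curves upward in one direction and downward in another at this point, which is exactly the saddle behavior described above.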


Comparisons of Methods 
We compared these two routines finding that Modified Newton’s Method con- 
verges faster than the gradient method. Table 4.4 shows the comparison. 


TABLE 4.4: Comparison of the Steepest Ascent and Modified Newton’s Methods 


f                  Initial Value   Iterations   Solution            max f(x) 
Steepest Ascent    (0, 0)          16           (0.9922, 0.9961)    1.0 
Modified Newton    (0, 0)          2            (1.0, 1.0)          1.0 

g                  Initial Value   Iterations   Solution            max g(x) 
Steepest Ascent    (1, 1)          10           (6.872, 4.498)      392.8 
Modified Newton    (1, 1)          2            (6.875, 4.500)      392.9 

h                  Initial Value   Iterations   Solution            max h(x) 
Steepest Ascent    (0.5, 0.5)      6            (1.025, 1.397)      8.828 
Modified Newton    (0.8, 0.8)      7            (1.031, 1.402)      8.827 


As a final comparison, add the modified Newton’s method points to the 
contour map showing the steepest ascent points for g of Figure 4.12 to obtain 
Figure 4.13 below. 
FIGURE 4.13: Steepest Ascent and Modified Newton’s Method Solutions 


When given a good starting value, a Modified Newton’s method search is 
much faster and more direct than a steepest ascent gradient method. 








Exercises 
1. Maximize $f(\mathbf{x}) = 2x_1x_2 - 2x_1^2 - 2x_2^2$ to within a tolerance of 0.1. 


a. Start at the point $\mathbf{x} = (1,1)$. Perform 2 complete iterations of the 
steepest ascent gradient search. For each iteration clearly show $\mathbf{x}_n$, 
$\mathbf{x}_{n+1}$, $\nabla f(\mathbf{x}_n)$, and $t^*$. Justify that the process will eventually find 
the approximate maximum. 

b. Use Newton’s method to find the maximum, starting at $\mathbf{x} = (1,1)$. 
Clearly show $\mathbf{x}_n$, $\mathbf{x}_{n+1}$, $\nabla f(\mathbf{x}_n)$, and the Hessian for each iteration. 
Indicate when the stopping criterion is achieved. 


2. Maximize $f(\mathbf{x}) = 3x_1x_2 - 4x_1^2 - 4x_2^2$ to within a tolerance of 0.1. 


a. Start at the point $\mathbf{x} = (1,1)$. Perform 2 complete iterations of the 
steepest ascent gradient search. For each iteration clearly show $\mathbf{x}_n$, 
$\mathbf{x}_{n+1}$, $\nabla f(\mathbf{x}_n)$, and $t^*$. Justify that the process will eventually find 
the approximate maximum. 

b. Use Newton’s method to find the maximum, starting at $\mathbf{x} = (1,1)$. 
Clearly show $\mathbf{x}_n$, $\mathbf{x}_{n+1}$, $\nabla f(\mathbf{x}_n)$, and the Hessian for each iteration. 
Indicate when the stopping criterion is achieved. 


3. Apply the modified Newton’s method to find the following: 

a. Maximize $f(x, y) = -x^3 + 3x + 84y - 6y^2$ starting with the initial 
value (1,1). Why can’t we start at (0,0)? 

b. Minimize $f(x, y) = -4x + 4x^2 - 3y + y^2$ starting at (0,0). 

c. Perform 3 modified Newton’s method iterations to minimize 
$f(x, y) = (x - 2)^4 + (x - 2y)^2$ starting at (0,0). Why is this problem not 
converging as quickly as b? 


4. Use a gradient search to approximate the minimum of 
$f(\mathbf{x}) = (x_1 - 2)^2 + x_1 + x_2^2$. Start at (2.5, 1.5). 


Projects 


Project 4.1. Modify the Steepest Ascent Algorithm (pg 147) to approximate 
the minimum of a function. This technique is called the Steepest Descent 
Algorithm. 


Project 4.2. Write a program in Maple that uses the one-dimensional Golden 
Section search algorithm instead of calculus to perform iterations of a gradient 
search. Use your code to find the maximum of 


$$f(x, y) = xy - x^2 - y^2 - 2x - 2y + 4$$





Project 4.3. Write a program in Maple that uses the one-dimensional 
Fibonacci search algorithm instead of calculus to perform iterations of a gra- 
dient search. Use your code to find the maximum of 





$$f(x, y) = xy - x^2 - y^2 - 2x - 2y + 4$$
4.4 Constrained Optimization: The Method of Lagrange Multipliers 


Equality Constraints: Method of Lagrange Multipliers 


A company manufactures new phones that are projected to take the market 
by storm. The two main input components of the new phone are the circuit 
board and the LCD Touchscreen that make the phone faster, smarter, and 
easier to use. The number of phones to be produced, $E$, is estimated to equal 
$E = 250\,a^{1/4} b^{1/2}$, where $a$ and $b$ are the number of circuit board production 
hours and the number of LCD Touchscreen production hours available, 
respectively. Such a formula is known to economists as a Cobb-Douglas function. Our 
laborers are paid by the type of work they do: the circuit board labor cost is 
$5 an hour and the LCD Touchscreen labor cost is $8 an hour. What is the 
maximum number of phones that can be made if the company has $175,000 
allocated for these components in the short run? 
Problems such as this can be modeled using constrained optimization. 
We begin our discussion with optimization with equality constraints, then we 
move to optimization with inequality constraints in the next section. 

Lagrange multipliers⁶ can be used to solve nonlinear optimization problems 
(NLPs) in which all the constraints are equations. Consider the NLP given by 

$$\begin{aligned} \text{Maximize (Minimize)} \quad & z = f(\mathbf{x}) \\ \text{subject to} \quad & g_1(\mathbf{x}) = b_1 \\ & g_2(\mathbf{x}) = b_2 \\ & \quad \vdots \\ & g_m(\mathbf{x}) = b_m \end{aligned} \qquad (4.2)$$

where $m < n$. 
We can build an equality constrained model for our phone problem: 
maximize the number of phones made using all available production hours. 

$$\begin{aligned} \text{Maximize} \quad & E = 250\,a^{1/4} b^{1/2} \\ \text{subject to} \quad & 5a + 8b = 175000 \end{aligned}$$

Lagrange Multipliers: Introduction and Basic Theory 


In order to solve NLPs in the form of (4.2), we associate a Lagrange multiplier 
$\lambda_i$ with the $i$th constraint, and form the Lagrangian function 

$$L(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) + \sum_{i=1}^{m} \lambda_i \left( g_i(\mathbf{x}) - b_i \right) \qquad (4.3)$$

The computational procedure for Lagrange multipliers requires that all the 
partials of the Lagrange function (4.3) must equal zero. The partials all 
equaling zero at $\mathbf{x}^*$ form the necessary conditions that $\mathbf{x}^*$ is a solution to the NLP 
problem; i.e., these conditions are required for $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ to be a 
solution to (4.2). We have 


Proposition. Lagrange Multiplier Necessary Conditions. 
For $\mathbf{x}^*$ to be a solution of the Lagrange function, all partials must satisfy 

$$\begin{aligned} \frac{\partial L}{\partial x_i} &= 0 \quad \text{for } i = 1, 2, \ldots, n \quad \text{(variables)} \\ \frac{\partial L}{\partial \lambda_j} &= 0 \quad \text{for } j = 1, 2, \ldots, m \quad \text{(constraints)} \end{aligned}$$

at $\mathbf{x}^*$. 


⁶ For a nice history of Lagrange’s technique, see Bussotti, “On the Genesis of the Lagrange 
Multipliers,” J Optimization Theory & Applications, Vol 117, No 3, pp 453-459, June 2003. 
The points x we will consider as candidates for optimal are 


Definition. Regular Point. 
$\mathbf{x}$ is a regular point iff the set $\{\nabla g_i(\mathbf{x}) : i = 1..m\}$ is linearly independent. 


The main theorem for Lagrange’s technique is 


Theorem. Lagrange Multipliers. 
Let $\mathbf{x}$ be a point satisfying the Lagrange multiplier necessary conditions. Then 


a. If $f$ is concave and each $g_i$ is linear, then $\mathbf{x}$ is an optimal solution of the 
maximization problem (4.2). 


b. If $f$ is convex and each $g_i$ is linear, then $\mathbf{x}$ is an optimal solution of the 
minimization problem (4.2). 


Previously, we used the Hessian matrix to determine if a function was 
convex, concave, or neither. Also note that the theorem limits constraints to 
linear functions. 

What if the constraints are nonlinear? We can use the bordered Hessian 
in the sufficient conditions. For a bivariate Lagrangian function with one 
constraint 

$$L((x_1, x_2), \lambda) = f((x_1, x_2)) + \lambda \left( g((x_1, x_2)) - b \right),$$

define the bordered Hessian matrix as 

$$BdH = \begin{bmatrix} 0 & g_1 & g_2 \\ g_1 & f_{11} + \lambda g_{11} & f_{12} + \lambda g_{12} \\ g_2 & f_{21} + \lambda g_{21} & f_{22} + \lambda g_{22} \end{bmatrix}$$

where subscripts denote partial derivatives, so the lower-right block holds the 
second partials of $L$ with respect to $x_1$ and $x_2$. The determinant of this 
bordered Hessian matrix is 

$$|BdH| = g_1 g_2 (f_{21} + \lambda g_{21}) + g_2 g_1 (f_{12} + \lambda g_{12}) - g_2^2 (f_{11} + \lambda g_{11}) - g_1^2 (f_{22} + \lambda g_{22})$$

The sufficient condition for a maximum in the bivariate case with one 
constraint is that the determinant of its bordered Hessian is positive when 
evaluated at the critical point. 

The sufficient condition for a minimum in the bivariate case with one 
constraint is that the determinant of its bordered Hessian is negative when 
evaluated at the critical point. 

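If $\mathbf{x}$ is a regular point and $g_i(\mathbf{x}) = 0$ (all constraints are satisfied at $\mathbf{x}$), 
then 

$$M_{\mathbf{x}} = \{ \mathbf{y} \mid \nabla g_i(\mathbf{x}) \cdot \mathbf{y} = 0 \}$$

defines a plane tangent to the feasible region at $\mathbf{x}$. (The ‘$\cdot$’ is the usual dot 
product.) 


Lemma. 
If $\mathbf{x}$ is regular, $g_i(\mathbf{x}) = 0$, and $\nabla g_i(\mathbf{x}) \cdot \mathbf{y} = 0$, then $\nabla f(\mathbf{x}) \cdot \mathbf{y} = 0$. 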
Note that the Lagrange Multiplier conditions are exactly the same for a 
minimization problem as for a maximization problem. This is the reason that 
these conditions alone are not sufficient conditions. Thus, a given solution can 
be either a maximum or a minimum. In order to determine whether the point 
found is a maximum, minimum, or saddle point, we will use the Hessian. 

The value of the Lagrange multiplier itself has an important modeling 
interpretation. The multiplier $\lambda$ is the “shadow price” for scarce resources. 
Thus, $\lambda_i$ is the shadow price of the $i$th constraint. If the right-hand side of the 
constraint is increased by a small amount, say $\Delta$, then the optimal value of the 
objective function will change by approximately $\lambda_i \Delta$ in magnitude. We will 
illustrate shadow prices both graphically and computationally. 
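The shadow price interpretation can be derived in a few lines for a single constraint $g(\mathbf{x}) = b$. The sketch below is a standard envelope-style argument, not a derivation given in this section; the notation $\mathbf{x}^*(b)$, $\lambda^*(b)$, and $f^*(b)$ for the optimal point, multiplier, and objective value as functions of the right-hand side is introduced here for illustration, and differentiability in $b$ is assumed.

```latex
% On the constraint, g(x*(b)) = b, so the Lagrangian equals the objective:
f^{*}(b) = f(\mathbf{x}^{*}(b))
         = L(\mathbf{x}^{*}(b), \lambda^{*}(b), b),
\qquad
L(\mathbf{x}, \lambda, b) = f(\mathbf{x}) + \lambda\,\bigl(g(\mathbf{x}) - b\bigr).

% Differentiate with respect to b; the first two terms vanish by the
% necessary conditions (all partials of L are zero at the optimum):
\frac{d f^{*}}{d b}
  = \underbrace{\nabla_{\mathbf{x}} L \cdot \frac{d\mathbf{x}^{*}}{db}}_{=\,0}
  + \underbrace{\frac{\partial L}{\partial \lambda}\,\frac{d\lambda^{*}}{db}}_{=\,0}
  + \frac{\partial L}{\partial b}
  = -\lambda^{*},
\qquad\text{so}\qquad
\Delta f^{*} \approx -\lambda^{*}\,\Delta .
```

With the sign convention $L = f + \lambda(g - b)$ used in (4.3), the change in the optimal value is therefore $-\lambda^*\Delta$; under the alternative convention $L = f - \lambda(g - b)$ it is $+\lambda^*\Delta$. Either way, $|\lambda^*|$ measures the worth of one more unit of the constrained resource.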


Graphical Interpretation of Lagrange Multipliers 


We can best understand the method of Lagrange multipliers by studying its 
geometric interpretation. This geometric interpretation involves the gradients 
of both the function and the constraints. Initially, consider only one constraint, 
$g(\mathbf{x}) = b$; then the Lagrangian equation simplifies to 

$$\nabla f = \lambda \nabla g.$$

The solution is the point $\mathbf{x}$ where the gradient vector $\nabla g(\mathbf{x})$ is perpendicular to 
contours on the surface. The gradient vector $\nabla f$ always points in the direction 
$f$ increases fastest. At either a maximum or a minimum, this direction must 
be perpendicular to contour lines on $f$’s surface $S$. Thus, since both $\nabla f$ and 
$\nabla g$ point along the same perpendicular line, $\nabla f = \lambda \nabla g$. Further, $g$’s 
curve must be tangent to $f$’s contours at optimal points. See Figure 4.14. 
The geometrical arguments are similar for the case of multiple constraints. 


(a) Contours and Constraint (b) Surface and Constraint 


FIGURE 4.14: Lagrange Multipliers Geometrically: One Equality Constraint 
Let’s preview a graphical solution to an example. 

$$\begin{aligned} \text{Maximize} \quad & z = -2x^2 - 2y^2 + xy + 8x + 3y \\ \text{subject to} \quad & 3x + y = 6 \end{aligned}$$


Generate a contour plot of z = f(x) with Maple, and overlay the single con- 
straint onto the contour plot. See Figure 4.15. What information can we obtain 
from this graphical representation? First, we note that the unconstrained opti- 
mum does not lie on the constraint. We estimate the unconstrained maximum 
as (x*,y*) = (2.3,1.3). The optimal constrained solution lies at the point 
where the constraint is tangent to a contour of z = f(x). This point is approxi- 
mately (1.8, 1.0) on the graph. We see clearly that the constraint does not pass 
through the unconstrained maximum, and thus, it can be modified/adjusted 
(if feasible) until the line passes through the unconstrained solution. At that 
point, we would no longer add (or subtract) any more constrained resources 
(see Figure 4.16). Valuable insights about the problem come from plotting the 
information, when possible. 


FIGURE 4.15: Contour Plot of f with Linear Constraint g 


Interpreting the value of $\lambda$ as a “shadow price” leads us to consider what 
happens when the amount of a resource governed by a constraint is changed. 
Figure 4.16 shows this concept graphically. 

FIGURE 4.16: Resource Constraint g is Increased by Constant a 


Let’s now turn to the calculations. 


Lagrange Multipliers Computations with Maple 


The set of equations in (4.2), pg. 163, gives a system of $n + m$ equations in 
the $n + m$ unknowns $\{x_i, \lambda_j\}$. Generally speaking, this system presents a very 
difficult problem to solve without a computer except for simple problems. Also, 
since the Lagrange Multipliers are necessary conditions only, not sufficient, we 
may find solutions $(x_i, \lambda_j)$ that are not optimal for our specific NLP. We need 
to be able to classify the points found in the solution. Commonly used methods 
for determining optimality include 


a. the definiteness of the Hessian matrix 
b. the definiteness of the bordered Hessian via det(BdH) 
We'll illustrate these, when feasible, in the following examples with Maple. 


Example 4.19. Lagrange Multipliers with Maple. 

$$\begin{aligned} \text{Maximize} \quad & z = -2x^2 - 2y^2 + xy + 8x + 3y \\ \text{subject to} \quad & 3x + y = 6 \end{aligned}$$

First, define f and g, and then display a plot. 


[> f := (x, y) -> -2*x^2 - 2*y^2 + x*y + 8*x + 3*y : 
   g := t -> 6 - 3*t : 
For the plot, we’ll embed the contour plot into a 3D plot to make a better 
visual representation of f. To embed the plot, use the transform command 
from the plottools package (contourplot and display are in the plots package). 


[> T := transform((x, y) -> [x, y, -500]) : 


Then T will take a point (x,y) on a 2D contour plot and embed it in a 3D 
plot on the plane z = −500. 


[> rng := (x = -10..10, y = -10..10) : 
   Surface := plot3d(f(x, y), rng, style = patchcontour, contours = 15) : 
   Contours := contourplot(f(x, y), rng, contours = 15) : 


Now, put the graphs together. 
[> display(Surface, T(Contours)); 








Rotate the 3D figure and inspect it from many different perspectives. Being 
able to rotate a 3D image is an incredibly useful feature. 
Back to the computations. 
The Lagrangian function is $L(x, y, \lambda) = f(x, y) + \lambda (g(x, y) - b)$. 
[> L := (x, y, lambda) -> f(x, y) + lambda*(3*x + y - 6); 
Calculate the system of necessary conditions. 


[> NecessCond := {diff(L(x, y, lambda), x) = 0, diff(L(x, y, lambda), y) = 0, 
                  diff(L(x, y, lambda), lambda) = 0}; 
   NecessCond := {−4x + y + 8 + 3λ = 0, x − 4y + 3 + λ = 0, 3x + y − 6 = 0} 


Solve the system NecessCond to find potential optimal points. 
[> opt := fsolve(NecessCond, {lambda, x, y}); 

   opt := {λ = −0.7608695652, x = 1.673913043, y = 0.9782608696} 
[> subs(opt, L(x, y, lambda)), subs(opt, f(x, y)); 

   10.44565217, 10.44565217 

Explain why these two values are equal! 

It will be no surprise that Maple has several commands for finding solu- 
tions to a constrained optimization problem; choose which command to use 
depending on the form of the output needed. Table 4.5 shows the most com- 
monly used commands. 





TABLE 4.5: Maple Commands for Constrained Optimization

Package                          Command
-------------------------------  -------------------
Student[MultivariateCalculus]    LagrangeMultipliers
  Basic Form: LagrangeMultipliers(ObjectiveFcn, [constraints], [vars], options)
  Options: return more detailed report and/or plots

Optimization                     NLPSolve
  Basic Form: NLPSolve(ObjectiveFcn, [constraints], options)
  Options: set solving method, starting point, detailed output, etc.

Optimization                     Maximize/Minimize
  Basic Form: Maximize(ObjectiveFcn, [constraints], options)
              Minimize(ObjectiveFcn, [constraints], options)
  Options: recommended: set a starting point


Remember, in “real world problems,” a critical element of the solution
using the Lagrange multiplier method is the value and interpretation of the
multiplier $\lambda$.


Using LagrangeMultipliers 


> with(Student[MultivariateCalculus]):

> Obj := f(x, y);
        Obj := -2x^2 + xy - 2y^2 + 8x + 3y
> Cnstr := 3*x + y - 6:
> LagrangeMultipliers(Obj, [Cnstr], [x, y], output = detailed);
        [x = 77/46, y = 45/46, λ1 = 35/46, -2x^2 + xy - 2y^2 + 8x + 3y = 961/92]

The plot output can be very useful.

> LagrangeMultipliers(Obj, [Cnstr], [x, y], output = plot);

[Plot: the intersection of the surface f(x, y) = -2x^2 + xy - 2y^2 + 8x + 3y
and one or more planes of the form [10.44565217] = constant.]





Using NLPSolve

Constraints must now be written as equations (or inequalities).

> with(Optimization):
> NLPSolve(Obj, {3*x + y = 6}, maximize);
        [10.4456521739130430, [x = 1.67391304347826, y = 0.978260869565217]]

We’ve lost the value of $\lambda$ since NLPSolve used a different method. Trying to
use the Lagrangian equation for the objective function doesn’t capture the
correct value.

> NLPSolve(Obj + lambda*(3*x + y - 6), {3*x + y = 6}, maximize);
        [10.4456521739130430, [λ = 1.00000000000000, x = 1.67391304347826,
                               y = 0.978260869565217]]








Using Maximize 


> with(Optimization):
> Maximize(Obj, {3*x + y = 6});
        [10.4456521739130430, [x = 1.67391304347826, y = 0.978260869565217]]

Still no $\lambda$.

> Maximize(Obj + lambda*(3*x + y - 6), {3*x + y = 6}, maximize);
Warning, problem appears to be unbounded
        [10.4456521739130, [λ = 0., x = 1.67391304347826, y = 0.978260869565217]]








And we did not capture the correct value of $\lambda$ again.
If we need complete information, it’s best to use either direct computation
of the method or the LagrangeMultipliers function to solve these problems.
Combine the information we’ve found. The solution is

    $f(x^*) = 10.4457$ with $x^* = (1.6739, 0.9783)$ and $\lambda = 0.76087$


We have a solution, but we need to know whether this solution represents a 
maximum or a minimum. 

We can use either the Hessian or the bordered Hessian to justify that we 
have found the correct solution to our constrained optimization problem. 


> with(Student[VectorCalculus]):
  with(LinearAlgebra):
> H := Hessian(f(x, y), [x, y]);
        H := [ -4   1 ]
             [  1  -4 ]
> IsDefinite(H, query = 'negative_definite');
        true

The Hessian is negative definite for all values of x, so the regular point, also
called the stationary point, $x^*$ is a maximum.
The bordered Hessian gives the same result.

> BdH := Hessian(f(x, y) + lambda*(3*x + y - 6), [lambda, x, y]);
        BdH := [ 0   3   1 ]
               [ 3  -4   1 ]
               [ 1   1  -4 ]
> Determinant(BdH);
        46

Since the determinant is positive we have found the maximum at the critical
point.
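The determinant can also be confirmed by hand. The following worked check is a sketch that assumes only the bordered Hessian entries shown above and expands along the first row:

```latex
\det(BdH)
  = 0\begin{vmatrix} -4 & 1 \\ 1 & -4 \end{vmatrix}
  - 3\begin{vmatrix} 3 & 1 \\ 1 & -4 \end{vmatrix}
  + 1\begin{vmatrix} 3 & -4 \\ 1 & 1 \end{vmatrix}
  = 0 - 3(-12 - 1) + (3 + 4)
  = 39 + 7 = 46 .
```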


Either method works in this example to determine that we have found the 
maximum for our constrained optimization problem. 

Now, let’s interpret the shadow price $\lambda = 0.76$. If the right-hand side of the
constraint is increased by a small amount $\Delta$, then the function will increase
by approximately $\lambda \cdot \Delta = 0.76 \cdot \Delta$. Since this is a maximization problem, we
would add to the restricting resource if possible because it would improve the
value of the objective function.

From a graph, it can be seen that the incremental change must be small or
the objective function could begin to decrease. Look back at Figure 4.16. If we
increase the right side of $g(x) = b$ by one unit so the constraint becomes
$3x + y = 7$, the solution at the new point $(x^{**}, y^{**})$ should yield a new maximum
functional value $f(x^{**}, y^{**})$ approximately equal to the old maximum plus
$\lambda$ times the change $\Delta$; that is, $f(x^{**}, y^{**}) \approx f(x^*, y^*) + \lambda\Delta$, here
$10.4457 + 0.7609 = 11.2065$. In actuality, changing the constraint yields a new solution
of 11.04347826. (Verify this!) The actual increase is about 0.5978.
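The comparison between the shadow-price estimate and the actual change can be cross-checked outside Maple. The following is a minimal Python sketch, not part of the original worksheet: the function name `solve_for_rhs` and its closed-form expressions are our own, obtained by eliminating y and the multiplier from the three necessary conditions of $L = f + \lambda(3x + y - b)$ for a general right-hand side b.

```python
from fractions import Fraction


def f(x, y):
    """Objective function of Example 4.19."""
    return -2 * x**2 - 2 * y**2 + x * y + 8 * x + 3 * y


def solve_for_rhs(b):
    """Solve the necessary conditions exactly for the constraint 3x + y = b.

    With L = f + lam*(3x + y - b) the conditions are
        -4x + y + 8 + 3*lam = 0,   x - 4y + 3 + lam = 0,   3x + y = b.
    Substituting y = b - 3x and lam = 4b - 3 - 13x gives 46x = 13b - 1.
    """
    b = Fraction(b)
    x = (13 * b - 1) / 46
    y = b - 3 * x
    lam = 4 * b - 3 - 13 * x
    return x, y, lam, f(x, y)


x6, y6, lam6, f6 = solve_for_rhs(6)   # original constraint
x7, y7, lam7, f7 = solve_for_rhs(7)   # right-hand side increased by one unit

estimate = f6 + (-lam6) * 1           # old maximum plus shadow price times change
actual_increase = f7 - f6

print("old maximum      :", float(f6))
print("linear estimate  :", float(estimate))
print("new maximum      :", float(f7))
print("actual increase  :", float(actual_increase))
```

The multiplier from the direct computation is negative in this sign convention, so the sketch uses its negative as the shadow price. The estimate is only a first-order approximation, which is why it overshoots for a full unit change.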





Example 4.20. Lagrange Multipliers with Multiple Constraints.
Minimize $w = x^2 + y^2 + 3z$ subject to $x + y = 3$ and $x + 3y + 2z = 7$.

First, we directly solve the problem using Maple.

> with(Student[MultivariateCalculus]):
> f := (x, y, z) -> x^2 + y^2 + 3*z:
> Constraints := (x + y - 3, x + 3*y + 2*z - 7):





Now, define the Lagrangian function L and find its gradient. 

> L := (x, y, z, lambda, mu) -> f(x, y, z) + lambda*Constraints[1] + mu*Constraints[2]:
> grad := Gradient(L(x, y, z, lambda, mu), [x, y, z, lambda, mu]);
        grad := [ 2x + λ + μ       ]
                [ 2y + λ + 3μ      ]
                [ 3 + 2μ           ]
                [ x + y - 3        ]
                [ x + 3y + 2z - 7  ]

Build the system of equations and solve it to find the potential solution.

> LMsystem := [seq(G = 0, G in grad)];
        LMsystem := [2x + λ + μ = 0, 2y + λ + 3μ = 0, 3 + 2μ = 0, x + y - 3 = 0,
                     x + 3y + 2z - 7 = 0]
> fsolve(LMsystem):
  fnormal(%, 4);
  subs(%, f(x, y, z));
        {λ = -0., μ = -1.500, x = 0.7500, y = 2.250, z = -0.2500}
        4.87500000
Check the solution! (We’ll use the “long-form,” that is “full name,” of the
commands so as to not load the packages.)

> Student:-VectorCalculus:-Hessian(f(x, y, z), [x, y, z]);
  LinearAlgebra:-IsDefinite(%, query = 'positive_semidefinite');
        [ 2  0  0 ]
        [ 0  2  0 ]
        [ 0  0  0 ]
        true





This Hessian is always positive semi-definite. The function is convex, and so 
our critical point is a minimum. 
Now, we’ll solve the problem using Maple’s LagrangeMultipliers function.

> with(Student[MultivariateCalculus]):
> LagrangeMultipliers(f(x, y, z), [Constraints], [x, y, z], output = detailed);
        [x = 3/4, y = 9/4, z = -1/4, λ1 = 0, λ2 = 3/2, x^2 + y^2 + 3z = 39/8]








The same Hessian shows the answer is our desired minimum. 

Interpret the shadow prices we found, $\lambda = 0$ and $\mu = -3/2$. If we only had
the funds to increase one of the two constraint resources, which one should
we choose? Since the shadow price of the first constraint is 0, we expect no
improvement in the objective function’s value; we would not spend extra funds
to increase the first resource. Since the shadow price of the second constraint is
$-1.5$, we expect to change the objective function’s value by $-1.5\Delta$ (Improving
the objective since we are minimizing!) if we increase that resource by $\Delta$; we
would spend extra funds to increase the second resource as long as the cost
of increasing resource 2 was less than $1.5\Delta$. Why does the cost need to be less
than $1.5\Delta$?


Applications using Lagrange Multipliers 


Many applied constrained optimization problems can be solved with Lagrange 
multipliers. We’ll start by revisiting the opening problem of this section (from 
page 162). 


Example 4.21. The Cobb-Douglas Function. 

A company manufactures new phones that are projected to take the market by 
storm. The two main input components of the new phone are the circuit board 
and the LCD Touchscreen that make the phone faster, smarter, and easier to 
use. The number of phones to be produced P is estimated to equal
$P = 250\,a^{1/4} b^{1/2}$ where a and b are the number of circuit board production hours
and the number of LCD Touchscreen production hours available, respectively.
This type of formula is known to economists as a Cobb-Douglas production
function.⁷ Laborers are paid by the type of work they do: the circuit board
labor cost is $5 an hour and the LCD Touchscreen labor cost is $8 an hour. 
What is the maximum number of phones that can be made if the company 
has $175,000 allocated for these components in the short run? 


Let’s use Maple.

> CD := (a, b) -> 250*a^0.25*b^0.5:
  Constraint := 5*a + 8*b - 175000:
> L := (a, b, lambda) -> CD(a, b) + lambda*Constraint:
> grad := Gradient(L(a, b, lambda), [a, b, lambda]);
        grad := [ 62.50 a^(-0.75) b^0.50 + 5λ ]
                [ 125.0 a^0.25 b^(-0.50) + 8λ ]
                [ 5a + 8b - 175000            ]
> LMsystem := [seq(G = 0, G in grad)];
        LMsystem := [62.50 a^(-0.75) b^0.50 + 5λ = 0, 125.0 a^0.25 b^(-0.50) + 8λ = 0,
                     5a + 8b - 175000 = 0]
> soln := fsolve(LMsystem, {a, b, lambda});
        soln := {a = 11666.66667, b = 14583.33333, λ = -1.344709033}
> subs(soln, L(a, b, lambda));
        313765.4410

Of course, LagrangeMultipliers gives the same results.

> LagrangeMultipliers(CD(a, b), [Constraint], [a, b], output = detailed):
  fnormal(%, 6);
        [a = 11666.7, b = 14583.3, λ1 = 1.34471, 250 a^0.25 sqrt(b) = 313765.]

We find we can make ≈ 313,765 phones using 11,666.67 production hours for
circuit boards and 14,583.33 production hours for LCD touchscreens.
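These numbers can be cross-checked without Maple. The sketch below is in Python rather than Maple and relies on the standard closed-form result for a Cobb-Douglas function with a linear budget: each input receives a share of the budget proportional to its exponent. The function name `cobb_douglas_optimum` and its parameter names are our own, chosen only for illustration.

```python
def cobb_douglas_optimum(scale, alpha, beta, price_a, price_b, budget):
    """Maximize scale * a**alpha * b**beta subject to price_a*a + price_b*b = budget.

    The Lagrange conditions give spending shares alpha/(alpha + beta) on input a
    and beta/(alpha + beta) on input b.
    """
    share_a = alpha / (alpha + beta)
    a = share_a * budget / price_a
    b = (1 - share_a) * budget / price_b
    output = scale * a**alpha * b**beta
    # Marginal output per extra dollar of budget; this is the magnitude of the
    # multiplier (its sign depends on how the Lagrangian is written).
    per_dollar = scale * alpha * a ** (alpha - 1) * b**beta / price_a
    return a, b, output, per_dollar


a, b, phones, per_dollar = cobb_douglas_optimum(250, 0.25, 0.5, 5, 8, 175000)
print(f"circuit board hours = {a:.2f}")
print(f"touchscreen hours   = {b:.2f}")
print(f"phones              = {phones:.2f}")
print(f"phones per dollar   = {per_dollar:.6f}")
```

One third of the budget goes to circuit board hours and two thirds to touchscreen hours, which reproduces the hours, output, and multiplier magnitude found above.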
A plot of the Cobb-Douglas function centered at (11,667, 14,583) showing
the constraint in Figure 4.17 finishes our analysis.








7 Originated in Cobb & Douglas, “A Theory of Production,” Am Econ Review, 18, 1928,
pp. 139-165.
FIGURE 4.17: Cobb-Douglas Phone Production Function


Example 4.22. Oil Transfer Company Storage Facilities. 

The management of a small oil transfer company desires a minimum cost 
policy taking into account the restrictions on available tank storage space. A 
formula has been derived from historical data records that describes system
costs.

$$f(x) = \sum_{n=1}^{N} \left( \frac{a_n b_n}{x_n} + \frac{h_n x_n}{2} \right)$$

where

    $a_n$ is the fixed costs for the nth item,
    $b_n$ is the withdrawal rate per unit time for the nth item,
    $h_n$ is the holding costs per unit time for the nth item.

The tank space constraint is given by:

$$g(x) = \sum_{n=1}^{N} t_n x_n = T$$

where $t_n$ is the space required for the nth item (in 1,000s bbl) and T is the
available tank space (in 1,000s bbl). The parameter values shown in Table 4.6
are based on the company’s data collected over several years.

TABLE 4.6: Oil Storage Data Parameter Estimates

Item n   a_n ($1,000s)   b_n   h_n    t_n (1,000s bbl)
1        9.6             3     0.47   1.4
2        4.27            5     0.26   2.62
3        6.42            4     0.61   1.71


The storage tanks have only 22 (1,000s bbl) space available. Find the 
optimal solution as a minimum cost policy. 

We’ll first solve the unconstrained problem. Its quantities will provide an
upper bound for the constrained quantities, and the solution will help us gain
insight into the dynamics of the problem.

Define the parameters, then define the functions. 





> N := 3:
  a := [9.6, 4.27, 6.42]:
  b := [3, 5, 4]:
  h := [0.47, 0.26, 0.61]:
  t := [1.4, 2.62, 1.71]:
> f := x -> sum(a[n]*b[n]/x[n] + h[n]*x[n]/2, n = 1..N):
  g := x -> sum(t[n]*x[n], n = 1..N):

> 'f'(x) = f(x);
  'g'(x) = g(x);
        f(x) = 28.8/x1 + 0.23500 x1 + 21.35/x2 + 0.13000 x2 + 25.68/x3 + 0.30500 x3
        g(x) = 1.4 x1 + 2.62 x2 + 1.71 x3


Build the system of equations $\{\partial f / \partial x_i = 0\}$ and solve it. (Remember to load
Student[MultivariateCalculus] for Gradient.)

> grad := Gradient(f(x), [x[1], x[2], x[3]]);
        grad := [ -28.8/x1^2 + 0.23500  ]
                [ -21.35/x2^2 + 0.13000 ]
                [ -25.68/x3^2 + 0.30500 ]
> sys := [seq(G = 0, G in grad)]:
> solns := solve(sys, [x[1], x[2], x[3]]):
  Matrix(solns);
        [ x1 =  11.07037450   x2 =  12.81525533   x3 =  9.175877141 ]
        [ x1 =  11.07037450   x2 = -12.81525533   x3 =  9.175877141 ]
        [ x1 =  11.07037450   x2 =  12.81525533   x3 = -9.175877141 ]
        [ x1 =  11.07037450   x2 = -12.81525533   x3 = -9.175877141 ]
        [ x1 = -11.07037450   x2 =  12.81525533   x3 =  9.175877141 ]
        [ x1 = -11.07037450   x2 = -12.81525533   x3 =  9.175877141 ]
        [ x1 = -11.07037450   x2 =  12.81525533   x3 = -9.175877141 ]
        [ x1 = -11.07037450   x2 = -12.81525533   x3 = -9.175877141 ]


The only useful solution is where $x_1, x_2, x_3 > 0$, so we choose the first row.

> Soln := solns[1];
        Soln := [x1 = 11.07037450, x2 = 12.81525533, x3 = 9.175877141]
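The positive root can be confirmed by hand, since each partial derivative involves only one variable. The following worked check is a sketch that assumes the cost function $f(x) = \sum_n (a_n b_n / x_n + h_n x_n / 2)$ as defined in the Maple session and the parameter values of Table 4.6:

```latex
\frac{\partial f}{\partial x_n} = -\frac{a_n b_n}{x_n^2} + \frac{h_n}{2} = 0
\quad\Longrightarrow\quad
x_n = \sqrt{\frac{2\,a_n b_n}{h_n}},
\qquad
\begin{aligned}
x_1 &= \sqrt{2(28.8)/0.47} \approx 11.070,\\
x_2 &= \sqrt{2(21.35)/0.26} \approx 12.815,\\
x_3 &= \sqrt{2(25.68)/0.61} \approx 9.176 .
\end{aligned}
```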


This solution $x^* = (11.07, 12.82, 9.18)$ provides an unconstrained upper bound
since it does not satisfy the constraint $g(x) = 22$.

> g(rhs~(Soln)) = 22;
        64.76524317 = 22


Now we solve the constrained optimization problem knowing that the solution
quantities will be less than the unconstrained values we just found.

> L := (x, lambda) -> f(x) + lambda*(g(x) - 22):
> Lgrad := Gradient(L(x, lambda), [x[1], x[2], x[3], lambda]):
  Lsys := [seq(G = 0, G in Lgrad)]:

Using solve as we did above gives a plethora of complex and negative values
that we don’t want. We’ll use fsolve with a starting value of the unconstrained
solution and $\lambda = 1$.

> StartingPt := {op(Soln), lambda = 1}:
  Lsolns := fnormal(fsolve(Lsys, StartingPt), 4);
        Lsolns := {λ = 0.7397, x1 = 4.761, x2 = 3.213, x3 = 4.045}
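The constrained solution can be reproduced with a few lines of code, because each stationarity condition can be solved for its own variable once the multiplier is fixed. The sketch below is in Python rather than Maple and is our own construction, not the method fsolve uses: it solves $-a_n b_n / x_n^2 + h_n/2 + \lambda t_n = 0$ for $x_n$ and then bisects on $\lambda$ until the tank space used equals 22. The helper names `quantities` and `space_used` and the bracketing interval are assumptions made for illustration.

```python
import math

a = [9.6, 4.27, 6.42]    # fixed costs
b = [3, 5, 4]            # withdrawal rates
h = [0.47, 0.26, 0.61]   # holding costs
t = [1.4, 2.62, 1.71]    # space required per unit
T = 22.0                 # available tank space


def quantities(lam):
    """Solve each stationarity condition for x_n at a given multiplier."""
    return [math.sqrt(an * bn / (hn / 2 + lam * tn))
            for an, bn, hn, tn in zip(a, b, h, t)]


def space_used(lam):
    """Tank space g(x) used by the quantities belonging to this multiplier."""
    return sum(tn * xn for tn, xn in zip(t, quantities(lam)))


# space_used decreases as lam grows, so bisection brackets the root.
lo, hi = 0.0, 10.0
for _ in range(100):
    mid = (lo + hi) / 2
    if space_used(mid) > T:
        lo = mid
    else:
        hi = mid

lam = (lo + hi) / 2
x = quantities(lam)
print("lambda =", round(lam, 4))
print("x      =", [round(v, 3) for v in x])
```

At a multiplier of zero the sketch returns the unconstrained quantities, which use far more than 22 units of space; raising the multiplier shrinks every quantity until the constraint is met.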








Do we have the minimum?
The Hessian matrix H (Remember to load Student[VectorCalculus]) is

> H := Hessian(f(x), [x[1], x[2], x[3]]);
        H := [ 57.6/x1^3       0            0       ]
             [     0       42.7/x2^3        0       ]
             [     0           0       51.36/x3^3   ]
> LinearAlgebra:-Determinant(H);
        126320.9472/(x1^3 x2^3 x3^3)

Since the Hessian is diagonal and every diagonal entry (and hence the
determinant) is positive for all positive x, the matrix is positive definite.
We have found a minimum for the constrained problem at
$x^* = (4.761, 3.213, 4.045)$.

Should we recommend the company add storage space? We know from
the unconstrained solution that, if possible, we would add storage space to
decrease the costs. We have found the shadow price $\lambda = 0.74$ which suggests
that any small increase $\Delta$ in the RHS of the constraint causes the objective
function to decrease by approximately $0.74 \cdot \Delta$. The cost of the extra storage
tank would have to be less than the savings incurred by adding the tank.








Exercises 
1. Solve the following constrained problems using the Lagrangian approach. 


(a) Minimize $z = x^2 + y^2$ subject to $x + 2y = 4$.
(b) Maximize $z = (x - 3)^2 + (y - 2)^2$ subject to $x + 2y = 4$.
(c) Maximize $z = x^2 + 4xy + y^2$ subject to $x^2 + y^2 = 1$.
(d) Maximize $z = x^2 + 4xy + y^2$ subject to $x^2 + y^2 = 4$ and $x + 2y = 4$.











2. Find and classify the extrema for $f(x, y, z) = x^2 + y^2 + z^2$ subject to
x? +2? -2z =1. 
3. Two manufacturing processes, $f_1$ and $f_2$, both use a resource with b units
available. Maximize $f_1(x_1) + f_2(x_2)$ subject to $x_1 + x_2 = b$.
If $f_1(x_1) = 50 - (x_1 - 2)^2$ and $f_2(x_2) = 50 - (x_2 - 2)^2$, analyze the
manufacturing processes to
(a) determine the amount of $x_1$ and $x_2$ to use to maximize $f_1 + f_2$,
(b) determine the amount b of the resource to use.

4. Maximize $Z = -2x^2 - y^2 + xy + 8x + 3y$ subject to $3x + y = 10$ and
$x^2 + y^2 = 16$.

5. Use the Method of Lagrange Multipliers to maximize $f(x, y, z) = xyz$
subject to $2x + 3y + 4z = 36$.
Determine how much $f(x, y, z)$ would change if one more unit was added
to the constraint.








4.5 Constrained Optimization: Kuhn-Tucker Conditions 
Inequality Constraints and the Kuhn-Tucker Conditions 


Previously, we investigated procedures for solving problems with equality con- 
straints. However, in most realistic problems, many constraints are expressed 
as inequalities. The inequality constraints form the boundaries of a set con- 
taining the solution. 

One method for solving NLPs with inequality constraints is by using the
Kuhn-Tucker Conditions (KTC) for optimality, sometimes called the
Karush-Kuhn-Tucker conditions.⁸ In this section, we’ll describe this set of conditions
first graphically, then analytically. We’ll discuss both necessary and sufficient
conditions for $x = (x_1, x_2, \ldots, x_n)$ to be an optimal solution to an NLP with
inequality constraints. We’ll also illustrate how to use Maple to compute
solutions. Last, we’ll close with example applications.


Basic Theory of Constrained Optimization 
The generic form of the NLPs we will study in this section is

    Maximize (Minimize) $z = f(x)$
    subject to                                                         (4.4)
    $g_i(x) \;\{\le,\ =,\ \ge\}\; b_i$, for $i = 1, 2, \ldots, m$

(Note: Since $a = b$ is equivalent to ($a \le b \wedge a \ge b$) and $a \ge b$ is equivalent to
$-a \le -b$, we could focus only on less-than inequalities; however, the technique
is more easily understood by allowing all three forms.)

Recall that the optimal solution to an NLP with only equality constraints 
had to fall on one constraint or at an intersection of several constraints. With 
inequality constraints, the solution no longer must lie on a constraint or at 
an intersection point of constraints. We need a method that describes the 
position of the optimal solution relative to each constraint. 





8 First formulated in Kuhn & Tucker, “Nonlinear programming,” Proc 2nd Berkeley Sym,
U Cal Press, 1951, pp. 481-492.

The technique based on the Kuhn-Tucker conditions involves defining a
Lagrangian function of the decision variables x, the Lagrange multipliers $\lambda_i$,
and the nonnegative slack or surplus variables $\mu_i^2$. The nonnegative slack
variable $\mu_i^2$ is a variable added to the ith ‘less-than or equal’ constraint to
transform it to an equality: $g_i(x) \le b_i \Rightarrow g_i(x) + \mu_i^2 = b_i$; the nonnegative variable
$\mu_i^2$ “picks up the slack” in the inequality. The surplus variable $\mu_j^2$ is a variable
subtracted from the jth ‘greater-than or equal’ constraint to make it an
equality: $g_j(x) \ge b_j \Rightarrow g_j(x) - \mu_j^2 = b_j$; the variable $\mu_j^2$ “holds the surplus” in
the inequality. In this formulation, the shadow price for the ith constraint is
$-\lambda_i$.

The Lagrangian function for our generic NLP (4.4) is

$$L(x, \lambda, \mu) = f(x) + \sum_{i=1}^{m} \lambda_i \left( g_i(x) \pm \mu_i^2 - b_i \right) \qquad (4.5)$$

Remember, the sign with $\mu_i^2$ is $+$ for $\le$ constraints and $-$ for $\ge$ constraints.

Analogously to the method of Lagrange multipliers, the computational
procedure based on the KTC requires that all the partials of the Lagrangian
function (4.5) equal zero. All these partials being equal to zero forms the
necessary conditions for the solution of (4.4) to exist.


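Theorem. Necessary Conditions for an Optimal Solution.
If $x^*$ is an optimum for the NLP (4.4), then

$$\frac{\partial L}{\partial x_j} = \frac{\partial f}{\partial x_j} + \sum_{i=1}^{m} \lambda_i \frac{\partial g_i}{\partial x_j} = 0 \quad \text{for } j = 1, 2, \ldots, n \qquad (4.6a)$$

$$\frac{\partial L}{\partial \lambda_i} = g_i(x) \pm \mu_i^2 - b_i = 0 \quad \text{for } i = 1, 2, \ldots, m \qquad (4.6b)$$

$$\frac{\partial L}{\partial \mu_i} = 2\mu_i \lambda_i = 0 \quad \text{for } i = 1, 2, \ldots, m \qquad (4.6c)$$

Condition (4.6c) is called the complementary slackness condition.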
The following theorem provides sufficient conditions for x* to be an opti- 
mal solution to the NLP given in (4.4). 


Theorem. Sufficient Conditions for an Optimal Solution.
Suppose each $g_i(x)$ is a convex function.

Maximum: If $f(x)$ is concave, then any point $x^*$ that satisfies the necessary
conditions is a maximal solution to (4.4). Further, each $\lambda_i \le 0$.

Minimum: If $f(x)$ is convex, then any point $x^*$ that satisfies the necessary
conditions is a minimal solution to (4.4). Further, each $\lambda_i \ge 0$.

If the necessary conditions are satisfied, but the sufficient conditions are
not completely satisfied, then we may use a bordered Hessian matrix to check
the nature of a potential stationary or regular point. The bordered Hessian
can be written in general as the block matrix

$$BdH := \begin{bmatrix} 0 & \nabla g \\ \nabla g^{T} & \dfrac{\partial^2 L}{\partial x_i\, \partial x_j} \end{bmatrix}$$

We can classify, if possible, the stationary points as maxima or minima
according to the bordered Hessian’s definiteness. If the bordered Hessian is indefinite,
then a different classification method must be used.


The Complementary Slackness Condition 


The KTC computational solution process solves the $2^m$ possible cases for $\lambda_i$
and $\mu_i$, where m equals the number of constraints, then applies the necessary
conditions to find optimal points. The 2 comes from the number of possibilities
for each $\lambda_i$: either $\lambda_i = 0$ or $\lambda_i \ne 0$. There is actually more to this process: it
really involves the complementary slackness condition embedded in the
necessary condition (4.6c), $2\mu_i\lambda_i = 0$. If $\mu_i$ equals zero, then $\lambda_i$, the shadow price,
can be nonzero and the ith constraint is binding: the optimal point lies on the
constraint boundary. If $\mu_i$ is not equal to zero, then $\lambda_i$, the shadow price, must
be zero and the ith constraint is nonbinding: there is slack ($\le$ constraint) or
surplus ($\ge$ constraint), represented by $\mu_i$. Ensuring the complementary
slackness conditions are satisfied reduces the work involved in solving the other
necessary conditions from Equations (4.6a) and (4.6b).

Based on this analysis, the complementary slackness necessary conditions
(4.6c) lead to the solution process that we focus on for our computational and
geometric interpretation. We have defined $\mu_i^2$ as a slack or surplus variable.
Therefore, if $\mu_i^2$ equals zero, then our optimal point lies on the ith constraint,
and if $\mu_i^2$ is greater than zero, the optimal point is interior to the ith constraint
boundary. However, if the value of $\mu_i$ is undefined because $\mu_i^2$ equals a negative
number, then the point of concern is infeasible. Figure 4.18 illustrates these
conditions.

[Figure panels: (a) $\mu_i^2 = 0$: On Constraint; (b) $\mu_i^2 > 0$: Inside; (c) $\mu_i^2 < 0$: Infeasible]

FIGURE 4.18: Complementary Slackness Geometrically
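The three situations in Figure 4.18 can be told apart numerically from the sign of the slack. The following Python sketch is illustrative only: the function name `classify`, the tolerance, and the sample constraint $2x + y \le 100$ with its test points are our own choices. It computes $\mu^2 = b - g(x)$ for a ‘less-than or equal’ constraint and reports which panel applies.

```python
def classify(g_value, b, tol=1e-9):
    """Classify a point against g(x) <= b using the slack mu^2 = b - g(x)."""
    mu_squared = b - g_value
    if abs(mu_squared) <= tol:
        return "on constraint"      # panel (a): binding, mu^2 = 0
    if mu_squared > 0:
        return "inside"             # panel (b): slack remains
    return "infeasible"             # panel (c): mu^2 would be negative


def g(x, y):
    """Left-hand side of the sample constraint 2x + y <= 100."""
    return 2 * x + y


for point in [(20, 60), (10, 10), (60, 0)]:
    print(point, classify(g(*point), 100))
```

Only in the first two situations does a real value of $\mu$ exist, which is why a negative $\mu^2$ signals an infeasible point.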


Computational KTC with Maple 
First, we’ll step through a solution using Maple for the computations. 


Example 4.23. Two Variable-Two Constraint Linear Problem. 


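    Maximize $z = 3x + 2y$
    subject to
    $2x + y \le 100$
    $x + y \le 80$

Define the generalized Lagrangian (4.5) for this problem.

$$L(x, \lambda, \mu) = (3x + 2y) + \lambda_1 (2x + y + \mu_1^2 - 100) + \lambda_2 (x + y + \mu_2^2 - 80)$$

The six Necessary Conditions are:

1. $\partial L/\partial x = 0 \;\Rightarrow\; 3 + 2\lambda_1 + \lambda_2 = 0$                (4.7a)
2. $\partial L/\partial y = 0 \;\Rightarrow\; 2 + \lambda_1 + \lambda_2 = 0$                 (4.7b)
3. $\partial L/\partial \lambda_1 = 0 \;\Rightarrow\; 2x + y + \mu_1^2 - 100 = 0$            (4.7c)
4. $\partial L/\partial \lambda_2 = 0 \;\Rightarrow\; x + y + \mu_2^2 - 80 = 0$              (4.7d)
5. $\partial L/\partial \mu_1 = 0 \;\Rightarrow\; 2\mu_1\lambda_1 = 0$                       (4.7e)
6. $\partial L/\partial \mu_2 = 0 \;\Rightarrow\; 2\mu_2\lambda_2 = 0$                       (4.7f)

Recognize that, since there are two constraints, there are four ($2^2$) cases
required to solve for the optimal solution. These 4 cases stem from necessary
conditions $2\mu_1\lambda_1 = 0$ and $2\mu_2\lambda_2 = 0$, the complementary slackness conditions.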

The cases are collected in Table 4.7.

TABLE 4.7: The Four Cases for Complementary Slackness

Case   Condition Imposed                        Condition Inferred
I.     $\lambda_1 = 0$, $\lambda_2 = 0$         $\mu_1^2 \ne 0$, $\mu_2^2 \ne 0$
II.    $\lambda_1 = 0$, $\lambda_2 \ne 0$       $\mu_1^2 \ne 0$, $\mu_2^2 = 0$
III.   $\lambda_1 \ne 0$, $\lambda_2 = 0$       $\mu_1^2 = 0$, $\mu_2^2 \ne 0$
IV.    $\lambda_1 \ne 0$, $\lambda_2 \ne 0$     $\mu_1^2 = 0$, $\mu_2^2 = 0$

For simplicity, we have arbitrarily made both x and y $\ge 0$ for this
maximization problem. Figure 4.19 shows a graphical representation.





FIGURE 4.19: The Region of Feasible Solutions 


Returning to Cases I-IV, we observe that:

CASE I. There is slack in both the first and second constraints as both
$\mu_i^2 > 0$. Therefore, we do not fall exactly on either of the constraint
boundaries. This corresponds to the intersection point labeled $P_1$ at (0, 0), since
only intersection points can lead to linear optimization solutions. This point
is feasible, but is clearly not optimal: moving away from (0, 0) in the first
quadrant increases the objective function. This case will not yield an optimal
solution.

CASE II. The possible solution point is on the second constraint, but not on
the first constraint. There are two possibilities in Figure 4.19. One point is
infeasible. The other is feasible, but not an optimal solution. This
case will not yield an optimal solution.

CASE III. The possible solution point is on the first constraint, but not
on the second constraint. There are two possible solutions in
Figure 4.19. One point is infeasible. Point $P_2$ is feasible, but not optimal. Again,
this case does not yield an optimal solution.

CASE IV. The possible solution point lies on both constraint 1 and
constraint 2 simultaneously. This corresponds to the intersection of the two
constraint lines in Figure 4.19, which is the optimal solution. It is the point of
the feasible region furthest in the direction of increased value of the objective
function. This case will yield the optimal solution to the problem.


Sensitivity analysis is also enhanced by a geometric approach. Figure 4.19
shows that increasing the right-hand side of either or both constraints will
extend the feasible region in the direction of the objective function’s increase,
thus increasing the value of the objective function. We can also see this through
the computational process and the solution’s values of the $\lambda_i$. Computational
sensitivity analysis can be derived with the value of the shadow price $-\lambda_i$.

Since the objective function and constraints are linear, and therefore both
convex and concave, the sufficient conditions are also satisfied.

The following computational analysis will show that Case IV yields the
optimal solution, confirming the graphical solution.


CASE I. $\lambda_1 = \lambda_2 = 0$. This case violates Equations (4.7a), $3 \ne 0$, and
(4.7b), $2 \ne 0$. This case also implies $\mu_i^2 \ne 0$ with slack in both inequalities.

CASE II. $\lambda_1 = 0$, $\lambda_2 \ne 0$. This case violates either Equation (4.7a), which
requires $\lambda_2 = -3$, or (4.7b), which requires $\lambda_2 = -2$.

CASE III. $\lambda_1 \ne 0$, $\lambda_2 = 0$. This case similarly violates either Equation (4.7a),
which requires $\lambda_1 = -3/2$, or (4.7b), which requires $\lambda_1 = -2$.

CASE IV. $\lambda_1 \ne 0$, $\lambda_2 \ne 0$. This case implies that $\mu_1^2 = \mu_2^2 = 0$ which reduces
(4.7) to the two sets

    $\{3 + 2\lambda_1 + \lambda_2 = 0,\; 2 + \lambda_1 + \lambda_2 = 0\}$
and
    $\{2x + y - 100 = 0,\; x + y - 80 = 0\}$

Solving these sets simultaneously yields the optimal solution $x^* = 20$, $y^* = 60$
giving the maximum $f(x^*, y^*) = 180$. We also have $\lambda_1 = -1$, $\lambda_2 = -1$,
$\mu_1^2 = \mu_2^2 = 0$. The shadow price indicates that for a small change $\Delta$ in the right-hand side
value of either Constraint 1 or Constraint 2, the objective value will increase
by approximately $-\lambda \cdot \Delta = \Delta$. The geometric interpretation reinforces the
computational results, giving them meaning and fully showing the effect of
binding constraints (constraints where $\mu_i^2 = 0$) on the solution.
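The case analysis can be mirrored in a few lines of code. The following Python sketch is our own illustration, not the Maple session: because (4.7a) and (4.7b) form a nonsingular 2-by-2 linear system in the multipliers, they have exactly one solution, and a case is consistent only if its imposed zero/nonzero pattern matches that solution. The helper `solve_2x2` (Cramer's rule) and the variable names are hypothetical.

```python
from fractions import Fraction

c = (3, 2)                 # gradient of the objective 3x + 2y
A = [(2, 1), (1, 1)]       # gradients of g1 = 2x + y and g2 = x + y
b = (100, 80)              # right-hand sides


def solve_2x2(m, rhs):
    """Solve a 2x2 linear system by Cramer's rule; return None if singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        return None
    u = Fraction(rhs[0] * m[1][1] - m[0][1] * rhs[1], det)
    v = Fraction(m[0][0] * rhs[1] - rhs[0] * m[1][0], det)
    return u, v


# (4.7a)-(4.7b): c_j + lam1*A[0][j] + lam2*A[1][j] = 0 for j = 0, 1.
lam = solve_2x2([[A[0][0], A[1][0]], [A[0][1], A[1][1]]], [-c[0], -c[1]])

cases = {
    "I": (False, False),    # lam1 = 0,  lam2 = 0
    "II": (False, True),    # lam1 = 0,  lam2 != 0
    "III": (True, False),   # lam1 != 0, lam2 = 0
    "IV": (True, True),     # lam1 != 0, lam2 != 0
}
pattern = tuple(value != 0 for value in lam)
consistent = [name for name, imposed in cases.items() if imposed == pattern]

# In Case IV both slacks vanish, so both constraints hold with equality.
x, y = solve_2x2(A, b)
print("multipliers:", lam, "consistent cases:", consistent)
print("point:", (x, y), "objective:", c[0] * x + c[1] * y)
```

The sketch covers only this two-constraint linear example; with more constraints than variables the same idea needs a case-by-case treatment of which constraints are binding.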

The following is a short Maple session of the computations above. 


> with(Student[MultivariateCalculus]):
> f := (x, y) -> 3*x + 2*y:
  Cnstrnt := [2*x + y - 100, x + y - 80]:
> L := (x, y, lambda, mu) -> f(x, y) + lambda[1]*(Cnstrnt[1] + mu[1]^2)
                                     + lambda[2]*(Cnstrnt[2] + mu[2]^2):
> grad := Gradient(L(x, y, lambda, mu), [x, y, lambda[1], lambda[2], mu[1], mu[2]]):
  NecCond := [seq(G = 0, G in grad)]:
  (NecCond);
        3 + 2λ1 + λ2 = 0
        2 + λ1 + λ2 = 0
        μ1^2 + 2x + y - 100 = 0
        μ2^2 + x + y - 80 = 0
        2 λ1 μ1 = 0
        2 λ2 μ2 = 0
> soln := solve(NecCond, [x, y, lambda[1], lambda[2], mu[1], mu[2]]);
        soln := [[x = 20, y = 60, λ1 = -1, λ2 = -1, μ1 = 0, μ2 = 0]]
> f(20, 60);
  subs(x = 20, y = 60, Cnstrnt);
        180
        [0, 0]


How do we know we’ve found a maximum? Recall the rules for finding the 
maximum or minimum. 

MAXIMUM: If $f(x)$ is a concave function and each constraint $g_i(x)$ is a convex
function, then any point that satisfies the necessary conditions is an optimal
solution that maximizes the function subject to the constraints and has each
$\lambda_i \le 0$.

MINIMUM: If $f(x)$ is a convex function and each constraint $g_i(x)$ is a convex
function, then any point that satisfies the necessary conditions is an optimal
solution that minimizes the function subject to the constraints and has each
$\lambda_i \ge 0$.

The objective function is linear, and so is both convex and concave, as are
the constraints. Since the values of $\lambda_i$ are both negative, we have found the
maximum.

We can use Maple’s own LagrangeMultipliers. 














> LagrangeMultipliers(f(x, y), Cnstrnt, [x, y], output = detailed);
        [x = 20, y = 60, λ1 = 1, λ2 = 1, 3x + 2y = 180]

Note the sign difference on Maple’s $\lambda$s. Explain how this occurs.

In the next example, we add one constraint, $x \le 40$, to the previous problem.
Adding one constraint causes the number of solution cases we must consider
to grow from $2^2$ to $2^3$, doubling from 4 to 8 cases; each additional constraint
doubles the number of cases. The new problem with three constraints is shown
in Figure 4.20. Again, for simplicity, we arbitrarily force both x and y $\ge 0$.


Example 4.24. Two Variable, Three Constraint Linear Problem. 


    Maximize $z = 3x + 2y$
    subject to
    $2x + y \le 100$
    $x + y \le 80$
    $x \le 40$

The new Lagrangian is

$$L(x, \lambda, \mu) = (3x + 2y) + \lambda_1(2x + y + \mu_1^2 - 100) + \lambda_2(x + y + \mu_2^2 - 80) + \lambda_3(x + \mu_3^2 - 40)$$


Add the new constraint to the graph. 





FIGURE 4.20: Contours of f(x) with Three Constraints 


A summary of the graphical interpretation is displayed in Table 4.8. The
optimal solution is found using Case V. Again, the computational solution
merely looks for the point where either all the necessary conditions are met or
are not violated. The geometric interpretation reinforces why the cases other
than V do not yield an optimal solution.

TABLE 4.8: The Eight Cases

Case    Condition Imposed                                              Point(s)                 Feasible   Optimal
I.      $\lambda_1 = 0$, $\lambda_2 = 0$, $\lambda_3 = 0$              $P_1$                    yes        no
        (all constraints have slack)
II.     $\lambda_1 \ne 0$, $\lambda_2 = 0$, $\lambda_3 = 0$            two points               no/no      no/no
        (on constraint 1, not 2 or 3)
III.    $\lambda_1 = 0$, $\lambda_2 \ne 0$, $\lambda_3 = 0$            $P_3$ and one other      yes/no     no/no
        (on constraint 2, not 1 or 3)
IV.     $\lambda_1 = 0$, $\lambda_2 = 0$, $\lambda_3 \ne 0$            one point                yes        no
        (on constraint 3, not 1 or 2)
V.      $\lambda_1 \ne 0$, $\lambda_2 \ne 0$, $\lambda_3 = 0$          one point                yes        yes
        (on constraints 1 & 2, not 3)
VI.     $\lambda_1 \ne 0$, $\lambda_2 = 0$, $\lambda_3 \ne 0$          one point                yes        no
        (on constraints 1 & 3, not 2)
VII.    $\lambda_1 = 0$, $\lambda_2 \ne 0$, $\lambda_3 \ne 0$          one point                no         no
        (on constraints 2 & 3, not 1)
VIII.   $\lambda_1 \ne 0$, $\lambda_2 \ne 0$, $\lambda_3 \ne 0$        none                     -          -
        (on constraint 1, 2, and 3)


The optimal solution will be found only in Case V which geometrically
shows that the solution is binding on Constraints 1 and 2, and not binding on
Constraint 3 (slack still exists). The optimal solution, found with the same
computations as Case IV of the previous example, is

    $f(x^*, y^*) = f(20, 60) = 180$,

the same as before. Constraint 3 did not alter the solution as it is nonbinding;
i.e., has slack in the solution. The “detailed solution” adds $\lambda_3$ and $\mu_3$.

    [x = 20, y = 60, λ1 = -1, λ2 = -1, λ3 = 0, μ1^2 = 0, μ2^2 = 0, μ3^2 = 20]


The geometric interpretation takes the mystery out of the case-wise solu- 
tions. We can visually see why in each specific case we can achieve or not 
achieve optimality conditions. Whenever possible, make a quick graph, and 
analyze the graph to eliminate as many cases as possible prior to doing the 
computational solution procedures. Let’s apply this procedure to another 
example. 




Example 4.25. Geometric Constrained Nonlinear Optimization 
Problem. 


    Maximize $z = (x - 14)^2 + (y - 11)^2$
    subject to
    $(x - 11)^2 + (y - 13)^2 \le 49$
    $x + y \le 19$

Use Maple to generate contour plots overlaid with the constraints to obtain
the geometrical interpretation shown in the worksheet below. The optimal
solution, as visually shown, is the point where the level curve of the objective
function is tangent to the constraint $x + y = 19$ in the direction of
increase for the contours of f. The solution satisfies the other constraint
$(x - 11)^2 + (y - 13)^2 \le 49$, but there is slack in this constraint. The solution
corresponds to the case where Constraint 2 is binding and Constraint 1 is
nonbinding. The constraints being nonbinding and binding, respectively, are
shown computationally by

    $(\lambda_1 = 0,\ \mu_1^2 \ne 0)$ and $(\lambda_2 \ne 0,\ \mu_2^2 = 0)$.
Finish by estimating the solution in the plot below. 
| > with(plots) : 


[> f= (ay) > (x — 14} + (y — 11): 

gı := (x,y) > (x — 11)? + (y — 13}: 

|. g2:= (x,y) ety: 

|> rng = (x = 2..21,y = 2..21) : 

fillopts := (filledregions = true, coloring = |wheat, white]) : 

[> fcp := contourplot( f(x,y), rng, contours = 40, color = grey); 

fcp19 := contourplot( f(x,y), rng, contours = [19], color = black); 

gicp := contourplot(gi(x, y), rng, contours = [49], color = red, 
thickness = 2, fillopts); 

g2cp := contourplot(gə(x, y), rng, contours = [19], thickness = 2) : 
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| > display(fep, fep19, g1cp, g2cp): 


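[Plot: contours of f over 2 ≤ x, y ≤ 21 with the boundary of Constraint 1, g1(x, y) = 49, and the boundary of Constraint 2, g2(x, y) = 19]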





We use the fact that Constraint 1 is nonbinding and Constraint 2 is binding
to directly solve this case to find the optimal solution. Graphically, we can
obtain a good approximation, but we cannot obtain the shadow prices, which
are invaluable in sensitivity analysis. In this case, we saw that

\[ (\lambda_1 = 0,\ \mu_1^2 \neq 0) \quad\text{and}\quad (\lambda_2 \neq 0,\ \mu_2^2 = 0). \]

The necessary conditions for this case are

\[
\begin{aligned}
\partial L/\partial x &\implies 2x - 28 + \lambda_2 = 0\\
\partial L/\partial y &\implies 2y - 22 + \lambda_2 = 0\\
\partial L/\partial \lambda_1 &\implies (x - 11)^2 + (y - 13)^2 + \mu_1^2 - 49 = 0\\
\partial L/\partial \lambda_2 &\implies x + y - 19 = 0
\end{aligned}
\]
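This case is small enough to solve by hand, which gives a useful check on any computed answer. The steps below are a sketch of one possible elimination order, not taken from the text: subtract the second condition from the first, combine the result with the binding constraint, and then back-substitute.

```latex
\begin{align*}
(2x - 28 + \lambda_2) - (2y - 22 + \lambda_2) = 0 &\implies x - y = 3,\\
x - y = 3 \ \text{ and } \ x + y = 19 &\implies x = 11,\quad y = 8,\\
\lambda_2 &= 28 - 2x = 6,\\
\mu_1^2 &= 49 - (x - 11)^2 - (y - 13)^2 = 49 - 0 - 25 = 24.
\end{align*}
```

Since $\mu_1^2 = 24 > 0$, Constraint 1 has slack, as the geometric picture suggested.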











Let’s use Maple to find the optimal point and the shadow prices.
We’ve defined f and the g_i above, so start by loading the
Student[MultivariateCalculus] package to make Gradient available.

> with(Student[MultivariateCalculus]):

> L := (x, y, lambda, mu) -> f(x, y) + lambda[1]*(g1(x, y) + mu[1]^2 - 49)
       + lambda[2]*(g2(x, y) + mu[2]^2 - 19):
> grad := Gradient(L(x, y, lambda, mu), [x, y, lambda[1], lambda[2], mu[1], mu[2]]):

We’ll add the equations defining the case, $\lambda_1 = 0$ and $\mu_2^2 = 0$, to the
necessary conditions’ system of equations.

> Case2NecCond := [lambda[1] = 0, mu[2]^2 = 0, seq(G = 0, G in grad)]:

For solving this system, we’ll use some of Maple’s “cleverness”: using RealDomain
will ignore complex solutions to our system, and we’ll solve for $\mu_i^2$,
rather than $\mu_i$.

> soln := RealDomain:-solve(Case2NecCond, [x, y, lambda[1], lambda[2], mu[1]^2, mu[2]^2]);
      soln := [[x = 11, y = 8, λ1 = 0, λ2 = 6, μ1² = 24, μ2² = 0]]
> f(11, 8);
      18

The optimal solution that satisfies the conditions is $x^* = 11$,
$y^* = 8$, $\lambda_1 = 0$, $\lambda_2 = 6$, $\mu_1^2 = 24$, and $\mu_2^2 = 0$. The value of the objective
function is $f(x^*, y^*) = 18$.

Interpreting the shadow prices shows that if there are more resources for
Constraint 2, our objective function will decrease. If we add $\Delta$ to the right-hand
side of Constraint 2, the objective function value will decrease by approximately
$6\Delta$. If we changed the right-hand side of the constraint from 19 to 20,
the optimal solution becomes $x^* = 11.5$, $y^* = 8.5$, and $f(x^*, y^*) = 12.5$, a
decrease of 5.5 units (verification is left as an exercise).
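The sensitivity claim can be checked numerically outside Maple. The following Python sketch (standard library only; the function names are ours, not the text's) assumes that Constraint 2 stays binding and Constraint 1 stays slack, so that the optimum is the point of the line x + y = b closest to (14, 11).

```python
def f(x, y):
    """Objective function of the example."""
    return (x - 14) ** 2 + (y - 11) ** 2


def closest_point_on_line(b, px=14.0, py=11.0):
    """Point of the line x + y = b nearest to (px, py)."""
    shift = (px + py - b) / 2.0
    return px - shift, py - shift


def constraint1_slack(x, y):
    """49 - g1(x, y); positive means Constraint 1 is nonbinding."""
    return 49 - ((x - 11) ** 2 + (y - 13) ** 2)


base_point = closest_point_on_line(19)
bumped_point = closest_point_on_line(20)
base_value = f(*base_point)
bumped_value = f(*bumped_point)

print(base_point, base_value)        # optimum for right-hand side 19
print(bumped_point, bumped_value)    # optimum for right-hand side 20
print(base_value - bumped_value)     # actual decrease, to compare with 6 * 1
print(constraint1_slack(*bumped_point) > 0)
```

The actual decrease is slightly smaller than the shadow-price estimate because the shadow price is only a first-order (linear) approximation.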

It is also possible to use the LagrangeMultipliers function from the
Student[MultivariateCalculus] package to get extensive information. To get the
extra analysis from Maple:

• write the constraints as they appear in the Lagrangian,
• write the constants in the constraints as decimals, and
• list $\lambda_i$ and $\mu_i$ as variables.

> Soln2 := LagrangeMultipliers(f(x, y), [g1(x, y) + mu[1]^2 - 49.0,
       g2(x, y) + mu[2]^2 - 19.0], [x, y, lambda[1], lambda[2], mu[1], mu[2]],
       output = detailed):





In order to make the output more readable, normalize the decimals to four
figures and increase the size of the matrix displayed. (The display here is smaller
and reduced to three digits to fit in the text; execute the code to see the full
solution display.)

> interface(rtablesize = 20):
> Matrix(fnormal([Soln2], 4));

      x = 14.0   y = 11.0   λ1 = 0.0   λ2 = 0.0    μ1 = 6.0     μ2 = −2.45 I
      x = 14.0   y = 11.0   λ1 = 0.0   λ2 = 0.0    μ1 = 6.0     μ2 = 2.45 I
      x = 14.0   y = 11.0   λ1 = 0.0   λ2 = 0.0    μ1 = −6.0    μ2 = −2.45 I
      x = 14.0   y = 11.0   λ1 = 0.0   λ2 = 0.0    μ1 = −6.0    μ2 = 2.45 I
      x = 11.0   y = 8.0    λ1 = 0.0   λ2 = −6.0   μ1 = −4.90   μ2 = 0.0
      x = 11.0   y = 8.0    λ1 = 0.0   λ2 = −6.0   μ1 = 4.90    μ2 = 0.0







The first four lines displayed have $\mu_2$ complex; we discard those since they are
not feasible. The fifth line has $\mu_1$ negative, so it is also discarded. The sixth
line corresponds to the solution we’ve found. The remaining lines (not shown
here) have $\lambda_1$ nonzero. Note that Maple’s $\lambda$s have the opposite sign to ours.

We close this example by once again pointing out that the value of the objective
function $f(x^*, y^*) = 18$ is a minimum because $f$ is convex, the $g_i$ are convex,
and the $\lambda_i$ are non-negative at $(x^*, y^*)$. The shadow price for Constraint 2 is
$\lambda_2 = 6$, and the slack variable for Constraint 1 is $\mu_1 \approx 4.9$ (that is, $\mu_1^2 = 24$).


Necessary and Sufficient Conditions for Computational KTC 


Visual interpretation from graphs can significantly reduce the amount of work
required to solve the problem. Interpreting the plot can identify the conditions
that hold at the optimal point, so that we can solve directly for that point.
Often, however, a graphical interpretation cannot be obtained, and we must rely
on the computational method alone. When this occurs, we must solve all the
cases and interpret the results.

Let’s redo the previous example without any graphical analysis. 


Example 4.26. Computational Constrained Nonlinear Optimization. 
\[
\begin{aligned}
\text{Minimize } \quad & z = (x - 14)^2 + (y - 11)^2\\
\text{subject to } \quad & (x - 11)^2 + (y - 13)^2 \le 49\\
& x + y \le 19
\end{aligned}
\]


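The Lagrangian is

\[ L(x, y, \lambda, \mu) = f(x, y) + \lambda_1\,\bigl(g_1(x, y) + \mu_1^2 - 49\bigr) + \lambda_2\,\bigl(g_2(x, y) + \mu_2^2 - 19\bigr). \]

Therefore, the necessary conditions are

\begin{align}
\partial L/\partial x &\implies 2(x - 14) + 2\lambda_1 (x - 11) + \lambda_2 = 0 \tag{4.8a}\\
\partial L/\partial y &\implies 2(y - 11) + 2\lambda_1 (y - 13) + \lambda_2 = 0 \tag{4.8b}\\
\partial L/\partial \lambda_1 &\implies (x - 11)^2 + (y - 13)^2 + \mu_1^2 - 49 = 0 \tag{4.8c}\\
\partial L/\partial \lambda_2 &\implies x + y + \mu_2^2 - 19 = 0 \tag{4.8d}\\
\partial L/\partial \mu_1 &\implies 2\lambda_1 \mu_1 = 0 \tag{4.8e}\\
\partial L/\partial \mu_2 &\implies 2\lambda_2 \mu_2 = 0 \tag{4.8f}
\end{align}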
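Before working through the cases, it can help to have a way of testing any candidate point against all six conditions at once. The Python sketch below (standard library only; `kkt_residuals` is our own helper name, not part of the text or of Maple) evaluates the left-hand sides of (4.8a) through (4.8f) at the point found in the previous example.

```python
import math


def kkt_residuals(x, y, lam1, lam2, mu1, mu2):
    """Left-hand sides of (4.8a)-(4.8f); all six vanish at a KTC point."""
    return [
        2 * (x - 14) + 2 * lam1 * (x - 11) + lam2,   # (4.8a)
        2 * (y - 11) + 2 * lam1 * (y - 13) + lam2,   # (4.8b)
        (x - 11) ** 2 + (y - 13) ** 2 + mu1 ** 2 - 49,  # (4.8c)
        x + y + mu2 ** 2 - 19,                       # (4.8d)
        2 * lam1 * mu1,                              # (4.8e)
        2 * lam2 * mu2,                              # (4.8f)
    ]


# Candidate from the geometric example: lambda1 = 0, lambda2 = 6, mu1^2 = 24, mu2 = 0.
candidate = dict(x=11, y=8, lam1=0, lam2=6, mu1=math.sqrt(24), mu2=0)
residuals = kkt_residuals(**candidate)
print(residuals)
print(all(abs(r) < 1e-9 for r in residuals))
```

Any point that fails this check can be discarded immediately; a point that passes still has to be examined for the sign of the multipliers and for feasibility.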


Define the functions in Maple, then investigate the cases for $\lambda_i$ and $\mu_i$
being zero or nonzero. We’ll use decimals for the constants to force floating-point
arithmetic to simplify the calculations. Remember to load the
Student[MultivariateCalculus] package to make the Gradient function available.

> with(Student[MultivariateCalculus]):
> f := (x, y) -> (x - 14)^2 + (y - 11)^2:
  g1 := (x, y) -> (x - 11)^2 + (y - 13)^2:
  g2 := (x, y) -> x + y:
> L := (x, y, lambda, mu) -> f(x, y) + lambda[1]*(g1(x, y) + mu[1]^2 - 49.0)
       + lambda[2]*(g2(x, y) + mu[2]^2 - 19.0):
> grad := Gradient(L(x, y, lambda, mu), [x, y, lambda[1], lambda[2], mu[1], mu[2]]):
  NecessaryConditions := [seq(G = 0, G in grad)]:











In each of the cases, we will solve for the variables $\mu_i^2$ rather than $\mu_i$ to reduce
complexity. We’ll also use RealDomain:-solve to avoid complex solutions. Defining
the list of variables as vars reduces typing and simplifies the command structure,
making it more readable.

> vars := [x, y, lambda[1], lambda[2], mu[1]^2, mu[2]^2]:


CASE 1. $\lambda_1 = 0$ and $\lambda_2 = 0$. Then $\mu_1^2 \neq 0$ and $\mu_2^2 \neq 0$.

> Case1 := lambda[1] = 0, lambda[2] = 0:
  subs(Case1, NecessaryConditions);
  s1 := RealDomain:-solve(%, vars):
  Matrix(subs(Case1, s1));
      [2x − 28 = 0, 2y − 22 = 0, (x − 11)² + (y − 13)² + μ1² − 49.0 = 0,
       x + y + μ2² − 19.0 = 0, 0 = 0, 0 = 0]

      [x = 14.   y = 11.   0 = 0   0 = 0   μ1² = 36.   μ2² = −6.]











This case is infeasible since $\mu_2^2 = -6$, which indicates that Constraint 2,
$g_2(x, y) \le 19$, is violated. It would have been easy to solve this necessary
conditions system by hand. The first two necessary conditions give $x = 14$
and $y = 11$ directly; substituting the values for $x$ and $y$ in the two constraints
then gives $\mu_1^2 = 36$ and $\mu_2^2 = -6$. Using Maple, however, provided a
template for solving all four cases.

CASE 2. $\lambda_1 = 0$ and $\lambda_2 \neq 0$. Then $\mu_1^2 \neq 0$ and $\mu_2^2 = 0$.

> Case2 := lambda[1] = 0, mu[2] = 0:
  subs(Case2, NecessaryConditions);
  s2 := RealDomain:-solve(%, vars):
  Matrix(subs(Case2, s2));
      [2x − 28 + λ2 = 0, 2y − 22 + λ2 = 0, (x − 11)² + (y − 13)² + μ1² − 49.0 = 0,
       x + y − 19.0 = 0, 0 = 0, 0 = 0]

      [x = 11.   y = 8.   0 = 0   λ2 = 6.   μ1² = 24.   0 = 0.]

















All the necessary conditions are satisfied at the point $(x^*, y^*) = (11, 8)$ where
$\lambda_1 = 0$, $\lambda_2 = 6$, $\mu_1^2 = 24$, and $\mu_2^2 = 0$. (Check this!) The value of $f$ at this
point is 18. It is left as an exercise to show that (11, 8) is an optimal point.








CASE 3. $\lambda_1 \neq 0$ and $\lambda_2 = 0$. Then $\mu_1^2 = 0$ and $\mu_2^2 \neq 0$.

> Case3 := lambda[2] = 0, mu[1] = 0:
  subs(Case3, NecessaryConditions);
  s3 := RealDomain:-solve(%, vars):
  fnormal(Matrix(subs(Case3, s3)), 4);
      [2x − 28 + λ1 (2x − 22) = 0, 2y − 22 + λ1 (2y − 26) = 0, (x − 11)² + (y − 13)²
       − 49.0 = 0, x + y + μ2² − 19.0 = 0, 0 = 0, 0 = 0]

      x = 5.176   y = 16.88   λ1 = −1.515    0. = 0.   0. = 0.   μ2² = −3.059
      x = 16.82   y = 9.117   λ1 = −0.4849   0. = 0.   0. = 0.   μ2² = −6.941





This case is also infeasible as $\mu_2^2$ is negative in both instances, which indicates
that Constraint 2, $g_2(x, y) \le 19$, is violated.
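The two Case 3 points have a simple geometric description that can be checked without Maple. With $\lambda_2 = 0$ and Constraint 1 binding, the gradients of $f$ and $g_1$ must be parallel, which places the point on the line through the circle's center (11, 13) and the unconstrained minimizer (14, 11). The Python sketch below (standard library only; all names are ours) computes both points on that line, the multiplier $\lambda_1$, and the value of $\mu_2^2$.

```python
import math

cx, cy, radius = 11.0, 13.0, 7.0   # circle g1(x, y) = 49
px, py = 14.0, 11.0                # unconstrained minimizer of f

# Unit vector from the circle's center toward the minimizer of f.
dist = math.hypot(px - cx, py - cy)
ux, uy = (px - cx) / dist, (py - cy) / dist

case3 = []
for sign in (+1, -1):
    x = cx + sign * radius * ux
    y = cy + sign * radius * uy
    lam1 = -(x - px) / (x - cx)    # from 2(x - 14) + 2*lam1*(x - 11) = 0
    mu2_sq = 19.0 - (x + y)        # from x + y + mu2^2 - 19 = 0
    case3.append((x, y, lam1, mu2_sq))
    print(round(x, 3), round(y, 3), round(lam1, 4), round(mu2_sq, 3))
```

Both values of $\mu_2^2$ come out negative, which is the numerical form of the statement that both points lie on the wrong side of the line $x + y = 19$.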


CASE 4. $\lambda_1 \neq 0$ and $\lambda_2 \neq 0$. Then $\mu_1^2 = 0$ and $\mu_2^2 = 0$.

> Case4 := mu[1] = 0, mu[2] = 0:
  subs(Case4, NecessaryConditions);
  s4 := RealDomain:-solve(%, vars):
  fnormal(Matrix(subs(Case4, s4)), 4);
      [2x − 28 + λ1 (2x − 22) + λ2 = 0, 2y − 22 + λ1 (2y − 26) + λ2 = 0,
       (x − 11)² + (y − 13)² − 49.0 = 0, x + y − 19.0 = 0, 0 = 0, 0 = 0]

      x = 12.77   y = 6.228   λ1 = −0.4148   λ2 = 3.926    0. = 0.   0. = 0.
      x = 4.228   y = 14.77   λ1 = −1.585    λ2 = −1.926   0. = 0.   0. = 0.

This case is again infeasible. The functional values are $f(12.77, 6.228) = 24.284$
and $f(4.228, 14.77) = 109.720$. These are not optimal values because they do
not satisfy the sufficient conditions on the $\lambda_i$ for a relative minimum. Show that
$f(4.228, 14.77)$ is a relative maximum. How does this happen?
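When both constraints are binding, the candidate points are simply the intersections of the circle and the line, so they can be found with the quadratic formula. The following Python sketch (standard library only; names are ours) substitutes $y = 19 - x$ into the circle equation and evaluates $f$ at the two intersection points. Because it works with unrounded coordinates, its function values can differ from the text's in the second or third decimal place.

```python
import math


def f(x, y):
    """Objective function of the example."""
    return (x - 14) ** 2 + (y - 11) ** 2


# Substituting y = 19 - x into (x - 11)^2 + (y - 13)^2 = 49 gives
# 2x^2 - 34x + 108 = 0, that is, x^2 - 17x + 54 = 0.
disc = math.sqrt(17 ** 2 - 4 * 54)
corners = []
for sign in (+1, -1):
    x = (17 + sign * disc) / 2
    y = 19 - x
    corners.append((x, y, f(x, y)))
    print(round(x, 3), round(y, 3), round(f(x, y), 3))
```

Comparing these values with $f(11, 8) = 18$ shows directly that neither intersection point can be the constrained minimum.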





Use Figure 4.21, which shows the contours of f, the constraints, and the
six points found in the cases above, to geometrically explain why each point
appeared as a potential solution.

FIGURE 4.21: Infeasible Solutions and the Real Solution


Example 4.27. Computational Constrained Nonlinear Optimization 
Redux. 


\[
\begin{aligned}
\text{Minimize } \quad & z = 2x^2 - 8x + y^2 - 6y\\
\text{subject to } \quad & x + y \le 4\\
& y \le 2
\end{aligned}
\]


We’ll use the template we developed in the previous example: choose one of
the four cases for $\lambda_1$ and $\lambda_2$, substitute these values into the necessary conditions
system of equations, solve the system (using $\mu_i^2$ as a variable), and analyze
the solution.

> f := (x, y) -> 2*x^2 - 8*x + y^2 - 6*y:
  g1 := (x, y) -> x + y:
  g2 := (x, y) -> y:

> L := (x, y, lambda, mu) -> f(x, y) + lambda[1]*(g1(x, y) + mu[1]^2 - 4.0)
       + lambda[2]*(g2(x, y) + mu[2]^2 - 2.0):

> grad := Gradient(L(x, y, lambda, mu), [x, y, lambda[1], lambda[2]]):

> vars := [x, y, lambda[1], lambda[2], mu[1]^2, mu[2]^2]:

> NecessaryConditions := [seq(G = 0, G in grad)];

      4x − 8 + λ1 = 0
      2y − 6 + λ1 + λ2 = 0
      x + y + μ1² − 4.0 = 0
      y + μ2² − 2.0 = 0

We’re ready to begin working through the four cases.

> Case1 := lambda[1] = 0, lambda[2] = 0:
  subs(Case1, NecessaryConditions);
  s1 := RealDomain:-solve(%, vars):
  Matrix(subs(Case1, s1));

      [4x − 8 = 0, 2y − 6 = 0, x + y + μ1² − 4.0 = 0, y + μ2² − 2.0 = 0]
      [x = 2.   y = 3.   0 = 0   0 = 0   μ1² = −1.   μ2² = −1.]











The point (2, 3) is not a solution; $\mu_1^2$ and $\mu_2^2$ are both negative, so both
constraints are violated.

> Case2 := lambda[1] = 0, mu[2]^2 = 0:
  subs(Case2, NecessaryConditions);
  s2 := RealDomain:-solve(%, vars):
  Matrix(subs(Case2, s2));

      [4x − 8 = 0, 2y − 6 + λ2 = 0, x + y + μ1² − 4.0 = 0, y − 2.0 = 0]
      [x = 2.   y = 2.   0 = 0   λ2 = 2.   μ1² = 0.   0 = 0.]

All necessary conditions are satisfied. Since the objective function is convex,
the constraints are linear (and therefore convex), and the $\lambda_i$ for all binding
constraints are positive ($\lambda_2 = 2$), the sufficient conditions are met. This solution
is the only solution that satisfies all the necessary and the sufficient conditions.
Thus (2, 2) is the optimal solution.

> Case3 := lambda[2] = 0, mu[1]^2 = 0:
  subs(Case3, NecessaryConditions);
  s3 := RealDomain:-solve(%, vars):
  fnormal(Matrix(subs(Case3, s3)), 4);

      [4x − 8 + λ1 = 0, 2y − 6 + λ1 = 0, x + y − 4.0 = 0, y + μ2² − 2.0 = 0]
      [x = 1.667   y = 2.333   λ1 = 1.333   0 = 0   0. = 0.   μ2² = −0.333]





The point (1.667, 2.333) is not an optimal solution as $\mu_2^2$ is negative, violating
Constraint 2.

> Case4 := mu[1]^2 = 0, mu[2]^2 = 0:
  subs(Case4, NecessaryConditions);
  s4 := RealDomain:-solve(%, vars):
  Matrix(subs(Case4, s4));

      [4x − 8 + λ1 = 0, 2y − 6 + λ1 + λ2 = 0, x + y − 4.0 = 0, y − 2.0 = 0]
      [x = 2.   y = 2.   λ1 = 0.   λ2 = 2.   0. = 0.   0. = 0.]

Case IV yields the same solution as Case II. 
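A brute-force search gives an independent sanity check on the case analysis. The Python sketch below (standard library only; the grid bounds and spacing are our own arbitrary choices) evaluates the objective $2x^2 - 8x + y^2 - 6y$ on a grid of points satisfying $x + y \le 4$ and $y \le 2$ and reports the smallest value found.

```python
def f(x, y):
    """Objective function of the example."""
    return 2 * x ** 2 - 8 * x + y ** 2 - 6 * y


steps = 20  # grid spacing is 1/20 = 0.05
best = None
for i in range(-2 * steps, 6 * steps + 1):        # x from -2 to 6
    for j in range(-4 * steps, 2 * steps + 1):    # y from -4 to 2, so y <= 2 holds
        x, y = i / steps, j / steps
        if x + y <= 4:                            # Constraint 1
            value = f(x, y)
            if best is None or value < best[0]:
                best = (value, x, y)

print(best)
```

A grid search only approximates the optimum in general; here the optimal point happens to lie exactly on the grid, so the search lands on the same point that Cases II and IV produce.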

Use Figure 4.22 which shows the contours of f, the constraints, and the 
points found in the cases above to geometrically explain why each point 
appeared as a potential solution. 














FIGURE 4.22: Infeasible Solutions and the Real Solution Redux 


Applications Using the Kuhn-Tucker Conditions Method 


Example 4.28. Maximizing Profit from Perfume Manufacturing. 

A company manufactures perfumes. They can purchase up to 1,925 oz of the 
main chemical ingredient at $10 per oz. An ounce of the chemical can produce 
an ounce of Perfume #1 with a processing cost of $3 per oz. On the other 
hand, the chemical can produce an ounce of the higher priced Perfume #2 
with a processing cost of $5 per oz. The Analytics department used historical 
data to estimate that Perfume #1 will sell for $(30 — 0.01x) per ounce if x 
ounces are manufactured, and Perfume #2 can sell for $(50 —0.02y) per ounce 
if y ounces are manufactured. The company wants to maximize profits.

MODEL FORMULATION. Let

      x = ounces of Perfume #1 produced
      y = ounces of Perfume #2 produced
      z = ounces of main chemical purchased

Then

\[
\begin{aligned}
\text{Maximize } \quad & P(x, y, z) = \underbrace{x\,(30 - 0.01x) + y\,(50 - 0.02y)}_{\text{Revenue}} - \underbrace{(3x + 5y + 10z)}_{\text{Cost}}\\
\text{subject to } \quad & x + y \le z\\
& z \le 1925
\end{aligned}
\]


SOLUTION. Set up the model, define the Lagrangian function L, and use the
techniques from the previous examples.

> P := (x, y, z) -> x*(30 - 0.01*x) + y*(50 - 0.02*y) - (3*x + 5*y + 10*z):
  g1 := (x, y, z) -> x + y - z:
  g2 := (x, y, z) -> z:

> L := (x, y, z, lambda, mu) -> P(x, y, z) + lambda[1]*(g1(x, y, z) + mu[1]^2 - 0.0)
       + lambda[2]*(g2(x, y, z) + mu[2]^2 - 1925.0):

> grad := Gradient(L(x, y, z, lambda, mu), [x, y, z, lambda[1], lambda[2]]):

> vars := [x, y, z, lambda[1], lambda[2], mu[1]^2, mu[2]^2]:

> NecessaryConditions := [seq(G = 0, G in grad)]:
  Vector(NecessaryConditions);

      27 − 0.02x + λ1 = 0
      45 − 0.04y + λ1 = 0
      −10 − λ1 + λ2 = 0
      μ1² + x + y − z = 0
      z + μ2² − 1925.0 = 0


Once again, we’re ready to begin working through the four cases. A summary
of the results from Maple is shown in Table 4.9.

TABLE 4.9: Perfume Application’s Four Cases

CASE   x        y        z      λ1       λ2      μ1²    μ2²
I.     -        -        -      0        0       -      -
II.    1350     1125     1925   0        10      −550   0
III.   850      875      1725   −10      0       0      200
IV.    983.33   941.67   1925   −7.333   2.667   0      0





Remarks:
CASE I. ∂L/∂z = 0 becomes −10 = 0; this case is infeasible.
CASE II. μ1² is negative, violating Constraint 1.
CASE III. λ_i ≤ 0 and μ_i² ≥ 0: Candidate Solution.
CASE IV. The λs have different signs, so this is not an optimal point.


We have a concave profit function P, linear constraints, and λ_i (λ1 =
−10) is negative for all binding constraints. Thus, we have met the sufficient
conditions for the point (850, 875) to be the optimal solution. The optimal
manufacturing strategy is to purchase z = 1725 ounces of the chemical and
produce x = 850 ounces of Perfume #1 and y = 875 ounces of Perfume #2,
yielding a profit of P = $22,537.50.
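The profit figures for the two cases with nonnegative slack can be recomputed directly from the model. The Python sketch below (standard library only; function names are ours) evaluates the profit function and checks feasibility at the Case III and Case IV points, using the exact fractions 2950/3 and 2825/3 for the rounded Case IV values 983.33 and 941.67.

```python
def profit(x, y, z):
    """Revenue minus cost for x oz of Perfume #1, y oz of Perfume #2, z oz of chemical."""
    revenue = x * (30 - 0.01 * x) + y * (50 - 0.02 * y)
    cost = 3 * x + 5 * y + 10 * z
    return revenue - cost


def feasible(x, y, z, tol=1e-6):
    """Both model constraints: x + y <= z and z <= 1925."""
    return x + y <= z + tol and z <= 1925 + tol


case_iii = (850.0, 875.0, 1725.0)
case_iv = (2950.0 / 3, 2825.0 / 3, 1925.0)

for name, point in (("III", case_iii), ("IV", case_iv)):
    print(name, feasible(*point), round(profit(*point), 2))
```

Both points are feasible, but buying all 1925 ounces (Case IV) earns less than buying only 1725 ounces (Case III), which matches the conclusion that the purchase limit is not binding at the optimum.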

Consider the significance of the shadow price λ1. How do we interpret
the shadow price in terms of this scenario? If we could obtain an extra ounce
(Δ = 1) of the main chemical at no cost, it would improve the profit to about
$22,537.50 + $10. What would be the largest cost for an extra ounce of the
main chemical that would still yield a higher profit?


Example 4.29. Minimum Variance of Expected Investment Returns. 
A new company has $5,000 to invest to generate funds for a planned project; 
the company needs to earn about 12% interest. A stock expert has sug- 
gested three mutual funds, A, B, and C, in which the company could invest. 
Based upon previous year’s returns, these funds appear relatively stable. The 
expected return, variance on the return, and covariance between funds are
shown in Table 4.10 below.

TABLE 4.10: Mutual Fund Investment Data

                   A      B      C
Expected Value     0.14   0.11   0.10
Variance           0.20   0.08   0.18

                   AB     AC     BC
Covariance         0.05   0.02   0.03





We use laws of expected value, variance, and covariance in our model. 


MODEL FORMULATION. Let $x_j$ be the number of dollars invested in fund $j$
for $j = 1, 2, 3$, representing Funds A, B, and C. Our objective is to minimize
the variance of the investment, so

\[
\begin{aligned}
V &= \operatorname{var}(x_1 A + x_2 B + x_3 C)\\
  &= x_1^2 \operatorname{var}(A) + x_2^2 \operatorname{var}(B) + x_3^2 \operatorname{var}(C) + 2x_1x_2 \operatorname{cov}(AB) + 2x_1x_3 \operatorname{cov}(AC) + 2x_2x_3 \operatorname{cov}(BC)\\
  &= 0.20x_1^2 + 0.08x_2^2 + 0.18x_3^2 + 0.10x_1x_2 + 0.04x_1x_3 + 0.06x_2x_3
\end{aligned}
\]

Our constraints are: $g_1$, the expected return is at least 12%, and $g_2$, the sum
of the investments is no more than \$5000. We have the NLP

\[
\begin{aligned}
\text{Minimize } \quad & V = 0.20x_1^2 + 0.08x_2^2 + 0.18x_3^2 + 0.10x_1x_2 + 0.04x_1x_3 + 0.06x_2x_3\\
\text{subject to } \quad & 0.14x_1 + 0.11x_2 + 0.10x_3 \ge 0.12 \cdot 5000 = 600\\
& x_1 + x_2 + x_3 \le 5000
\end{aligned}
\]
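The step from the covariance data to the numerical coefficients is where the factor of 2 on each cross term is easy to lose. As a check, the Python sketch below (standard library only; the names and the sample allocation are ours) evaluates the variance both as the full quadratic form over the symmetric covariance matrix and from the expanded polynomial, and compares the two.

```python
funds = ["A", "B", "C"]
variance = {"A": 0.20, "B": 0.08, "C": 0.18}
covariance = {("A", "B"): 0.05, ("A", "C"): 0.02, ("B", "C"): 0.03}


def sigma(i, j):
    """Entry (i, j) of the symmetric covariance matrix."""
    if i == j:
        return variance[i]
    return covariance[(i, j)] if (i, j) in covariance else covariance[(j, i)]


def quadratic_form(x):
    """Sum over all ordered pairs; each covariance is counted twice."""
    return sum(x[a] * x[b] * sigma(funds[a], funds[b])
               for a in range(3) for b in range(3))


def expanded(x):
    """The polynomial used in the model."""
    x1, x2, x3 = x
    return (0.20 * x1 ** 2 + 0.08 * x2 ** 2 + 0.18 * x3 ** 2
            + 0.10 * x1 * x2 + 0.04 * x1 * x3 + 0.06 * x2 * x3)


sample = (1000.0, 2500.0, 1500.0)  # an arbitrary allocation, for comparison only
print(quadratic_form(sample), expanded(sample))
```

The two values agree up to floating-point rounding, confirming that the coefficients 0.10, 0.04, and 0.06 are twice the tabulated covariances.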





SOLUTION. Set up the Lagrangian function L.

\[ L(x, \lambda, \mu) = V(x) + \lambda_1\,\bigl(g_1(x) - \mu_1^2 - 600\bigr) + \lambda_2\,\bigl(g_2(x) + \mu_2^2 - 5000\bigr) \]

(Why is $\mu_1^2$ subtracted rather than added?)
Define the functions for the model and calculate the necessary conditions.

> with(Student[MultivariateCalculus]):

> V := x -> 0.20*x[1]^2 + 0.08*x[2]^2 + 0.18*x[3]^2 + 0.10*x[1]*x[2]
       + 0.04*x[1]*x[3] + 0.06*x[2]*x[3]:
  g1 := x -> 0.14*x[1] + 0.11*x[2] + 0.10*x[3]:
  g2 := x -> x[1] + x[2] + x[3]:

> L := (x, lambda, mu) -> V(x) + lambda[1]*(g1(x) - mu[1]^2 - 600.0)
       + lambda[2]*(g2(x) + mu[2]^2 - 5000.0):

> grad := Gradient(L(x, lambda, mu), [x[1], x[2], x[3], lambda[1], lambda[2]]):

Notice that we have again left $\mu_1$ and $\mu_2$ out of the list of variables; the
complementary slackness conditions are taken care of by considering our standard
four cases ($4 = 2^{\text{number of constraints}}$).

> NecessaryConditions := [seq(G = 0, G in grad)]:
  Vector(NecessaryConditions);

      0.40 x1 + 0.10 x2 + 0.04 x3 + 0.14 λ1 + λ2 = 0
      0.16 x2 + 0.10 x1 + 0.06 x3 + 0.11 λ1 + λ2 = 0
      0.36 x3 + 0.04 x1 + 0.06 x2 + 0.10 λ1 + λ2 = 0
      0.14 x1 + 0.11 x2 + 0.10 x3 − μ1² − 600.0 = 0
      x1 + x2 + x3 + μ2² − 5000.0 = 0
Let Maple do the work.

> s := LagrangeMultipliers(V(x), [g1(x) - mu[1]^2 - 600, g2(x) + mu[2]^2 - 5000],
       [x[1], x[2], x[3], lambda[1], lambda[2], mu[1], mu[2]], output = detailed):
  S := fnormal([s], 4):

The output would have overflowed the page and would be very hard to read,
so we’ll use some “Maple Magic” to put it in an easier-to-handle form.

> Lgnd := <op(map(lhs, S[1, 1..7])) | V>:
  Sdata := map(rhs, Matrix(S)[.., [1..7, 10]]):
  <Lgnd, Sdata>;

      x1       x2       x3       λ1        λ2       μ1         μ2         V
      1905.0   2381.0   714.3    13810.0   −904.8   0.0        0.0        1881000.0
      702.2    3118.0   1180.0   0.0       639.9    −6.382 I   0.0        1600000.0
      702.2    3118.0   1180.0   0.0       639.9    6.382 I    0.0        1600000.0
      0.0      0.0      0.0      0.0       0.0      −24.49 I   −70.71     0.0
      0.0      0.0      0.0      0.0       0.0      −24.49 I   70.71      0.0
      0.0      0.0      0.0      0.0       0.0      24.49 I    −70.71     0.0
      0.0      0.0      0.0      0.0       0.0      24.49 I    70.71      0.0
      1250.0   2929.0   1028.0   5958.0    0.0      0.0        −14.39 I   1787000.0
      1250.0   2929.0   1028.0   5958.0    0.0      0.0        14.39 I    1787000.0





Inspect the results to see that only the first row has a feasible solution; the
rest have a negative $\mu_i^2$ (imaginary $\mu_i$). Notice that, in row 1, both $\lambda_i \neq 0$,
that is, both constraints are binding. Then

> s[1][1..3];
  Optimum := eval(V(x), s[1]);
      [x1 = 1904.761905, x2 = 2380.952381, x3 = 714.2857143]
      1.880952381 × 10^6
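When both constraints are binding, the necessary conditions reduce to five linear equations in $x_1, x_2, x_3, \lambda_1, \lambda_2$, so the optimum can be reproduced with exact rational arithmetic. The Python sketch below (standard library only; `solve_linear` is our own small Gauss-Jordan routine and assumes a square, nonsingular system) uses the sign convention of the Lagrangian written in this example, so its multipliers have the opposite sign to the ones Maple's LagrangeMultipliers reports.

```python
from fractions import Fraction as F


def solve_linear(A, b):
    """Gauss-Jordan elimination over the rationals; assumes A is square and nonsingular."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        lead = M[col][col]
        M[col] = [v / lead for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [row[-1] for row in M]


# Unknowns: x1, x2, x3, lambda1, lambda2 (both slack variables set to zero).
rows = [
    ["0.40", "0.10", "0.04", "0.14", "1"],   # dL/dx1 = 0
    ["0.10", "0.16", "0.06", "0.11", "1"],   # dL/dx2 = 0
    ["0.04", "0.06", "0.36", "0.10", "1"],   # dL/dx3 = 0
    ["0.14", "0.11", "0.10", "0", "0"],      # expected return equals 600
    ["1", "1", "1", "0", "0"],               # total investment equals 5000
]
A = [[F(v) for v in row] for row in rows]
b = [F(0), F(0), F(0), F(600), F(5000)]

x1, x2, x3, lam1, lam2 = solve_linear(A, b)
print(x1, x2, x3)
print([float(v) for v in (x1, x2, x3, lam1, lam2)])
```

Working in fractions shows that the decimals 1904.76, 2380.95, and 714.29 are the rationals 40000/21, 50000/21, and 5000/7.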


Check the Hessian.

> H := Student:-VectorCalculus:-Hessian(V(x), [x[1], x[2], x[3]]);

           [0.40   0.10   0.04]
      H := [0.10   0.16   0.06]
           [0.04   0.06   0.36]

The Hessian matrix H has all positive leading principal minors. Since H is
constant, it is positive definite everywhere, so V is convex and our solution is
the minimum. The expected return is 12.0%, found from
$g_1(x)/5000 = (0.14 \cdot 1904.8 + 0.11 \cdot 2381.0 + 0.10 \cdot 714.3)/5000$.
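The leading principal minors are easy to compute by hand for a 3 × 3 matrix, and just as easy to confirm in code. The Python sketch below (standard library only; the cofactor-expansion `det` helper is ours and is only sensible for small matrices) computes the three minors exactly.

```python
from fractions import Fraction as F

H = [
    [F("0.40"), F("0.10"), F("0.04")],
    [F("0.10"), F("0.16"), F("0.06")],
    [F("0.04"), F("0.06"), F("0.36")],
]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = F(0)
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


# k-th leading principal minor: determinant of the upper-left k x k block.
minors = [det([row[:k] for row in H[:k]]) for k in (1, 2, 3)]
print([float(v) for v in minors])
print(all(v > 0 for v in minors))
```

All three minors are positive, which is Sylvester's criterion for a symmetric matrix to be positive definite.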








Exercises 


1. Solve the following constrained problems using the Kuhn-Tucker Conditions
(KTC) approach.

(a) Minimize $z = x^2 + y^2$ subject to $x + 2y = 4$.

(b) Maximize $z = (x - 3)^2 + (y - 2)^2$ subject to $x + 2y = 4$.

(c) Maximize $z = x^2 + 4xy + y^2$ subject to $x^2 + y^2 = 1$.

(d) Maximize $z = x^2 + 4xy + y^2$ subject to $x^2 + y^2 = 4$ and $x + 2y = 4$.


2. Maximize $Z = 3X^2 + Y^2 + 2XY + 6X + 2Y$ subject to $2X - Y = 4$.
Did you find the maximum? Explain.


3. Two manufacturing processes, $f_1$ and $f_2$, both use a resource with $b$ units
available. Maximize $f_1(x_1) + f_2(x_2)$ subject to $x_1 + x_2 = b$.
If $f_1(x_1) = 50 - (x_1 - 2)^2$ and $f_2(x_2) = 50 - (x_2 - 2)^2$, analyze the
manufacturing processes using the KTC approach to
(a) determine the amount of $x_1$ and $x_2$ to use to maximize $f_1 + f_2$,
(b) determine the amount $b$ of the resource to use.


4. Use the Kuhn-Tucker Conditions to find the optimal solution to the following
nonlinear problems.

(a) Maximize $f(x, y) = -x^2 - y^2 + xy + 7x + 4y$ subject to $2x + 3y \ge 16$
and $-5x + 12y \le 20$.

(b) Minimize $f(x, y) = 2x + xy + 3y$ subject to $x^2 + y \ge 3$ and
$2.5 - 0.5x - y \le 0$.

(c) Minimize $f(x, y) = 2x + xy + 3y$ subject to $x^2 + y \ge 3$, $x + 0.5 \ge 0$,
and $y \ge 0$.

(d) Maximize $f(x, y) = -(x - 0.4)^2 - (y - 5)^2$ subject to $-x + 2y \le 4$,
$x^2 + y^2 \le 14$, and $x, y \ge 0$.





5. Minimize $z = x^2 + y^2$
subject to
$2x + y \le 100$
$x + y \le 80$





























6. Maximize z = −(x − 4)² + xy − (y
subject to
2x + 3y ≤ 18
2x + y ≤ 8
a 
Projects 


Project 4.1. A newspaper publisher must purchase three types of paper
stock. The publisher must meet the demand, but desires to minimize costs
in the process. An Economic Lot Size model is chosen to assist them in their
decisions. An Economic Order Quantity (EOQ) model with constraints, where
the total cost is the sum of the individual quantity costs, is
$C(Q_1, Q_2, Q_3) = C(Q_1) + C(Q_2) + C(Q_3)$ with

\[ C(Q_i) = a_i d_i / Q_i + h_i \cdot (Q_i / 2) \]

where

      $d_i$ = the order rate
      $h_i$ = the cost per unit time for storage (holding cost)
      $Q_i/2$ = the average inventory (amount on hand)
      $a_i$ = the cost of placing an order
The constraint is the amount of storage area available to the publisher for 


storing the three kinds of paper. The items cannot be stacked, but can be laid 
side by side. The available storage area is S = 200 sq ft. 


Table 4.11 shows data that has been collected on the publisher’s usage and 
costs. 


TABLE 4.11: Paper Usage and Cost Data 





        TYPE I.   TYPE II.   TYPE III.   units
d_i     32        24         20          rolls/week
a_i     25        18         20          dollars
h_i     1.00      1.50       2.00        dollars
s_i     4         3          2           sq ft/roll


REQUIRED. 


(a) Find the paper quantities that give the unconstrained minimum total 
cost, and show that these values would not satisfy the constraint. What 
purpose do these values serve? 


(b) Find the constrained optimal solution by using Lagrange Multipliers 
assuming all 200 sq feet is used. 


(c) Determine and interpret the shadow prices. 


Project 4.2. In the tank storage problem, Example 4.22 (pg. 175), determine 
whether it is better to have cylindrical storage tanks or rectangular storage 
tanks of 50 cubic units. 


Project 4.3. Use the Cobb-Douglas function P(L, K) = A·L^a·K^b, where L is
labor and K is capital, to predict output in thousands, based upon the amount
of labor and capital used. Suppose the prices of capital and labor per year are
$10,000 and $7,000, respectively. The company estimates the values A =
1.2, a = 0.3, and b = 0.6. Your total cost is assumed to be T = P_L·L + P_K·K,
where P_K and P_L are the prices of capital and labor. There are three possible
funding levels: $63,940, $55,060, or $71,510. Determine which budget yields
the best solution for the company. Interpret the Lagrange multiplier.


Problem Solving with Linear Systems of
Equations Using Linear Algebra Techniques











Objectives: 
(1) Set up a system of equations to solve a problem. 


(2) Recognize unique solutions, no solution and infinite solutions as 
results. 





5.1 Introduction 


In the mid-1800s, a large number of ironwork bridges were constructed as 
railways crossed the continents. One of the popular designs for trusses, rigid 
triangular support structures, was the Warren truss with vertical supports.¹
The center span of the 1926 Bridge of the Gods (Figure 5.1) over the Columbia 
River is a nice example. 





FIGURE 5.1: Bridge of the Gods Warren Truss with Vertical Supports 





¹ See Frank Griggs’ “The Warren Truss,” Structure, July 2015, for a brief history of the
Warren truss; available at https://www.structuremag.org/?p=8715.

Warren trusses were designed to carry heavy loads. The civil engineering
technique “Method of Joints” is used to analyze the forces acting on the truss.
Individual parts of the truss are connected with rivets, rotatable pin joints,
or welds that permit forces to be transferred from one member of the truss to
another.

Figure 5.2 shows a truss that is fixed at the lower left endpoint $p_1$, and
can move horizontally at the lower right endpoint $p_4$. The truss has pin-joints
at $p_1$, $p_2$, $p_3$, and $p_4$. A load of 10 kilonewtons (kN) is placed at joint $p_3$; the
forces on the members of the truss have magnitudes $f_1$, $f_2$, $f_3$, $f_4$, and $f_5$ as
indicated in the figure. The stationary support member has both a horizontal
force $F_1$ and a vertical force $F_2$ (see Figure 5.3); the horizontally movable
support member has only a vertical force $F_3$.²











FIGURE 5.2: A Warren Truss with Vertical Supports 


If the truss is in static equilibrium, the forces at each joint must sum to the 
zero vector. If there were net nonzero forces, the joint would move—the truss 
would not be in static equilibrium. Therefore, the corresponding components 
of the vector must also be zero; i.e., the sum of the horizontal components of 
the forces at each joint must be zero and the sum of the vertical components 
must be zero. 





² Variants of this problem appear in many textbooks; see, e.g., Burden and Faires
[BurdenFaires2005], Fox [Fox2011Maple].

[Figure: a force vector $F$ at angle $\theta$ resolved into components $F_x = \cos(\theta) \cdot F$ and $F_y = \sin(\theta) \cdot F$]

FIGURE 5.3: Force Vector Components


A system of linear equations will model the forces. We will build this model 
and solve its linear system later in the chapter. 





5.2 Introduction to Systems of Equations 


In this chapter, we illustrate the use of systems of equations to solve real- 
world applications from the sciences, engineering, and economics. Previously, 
we have seen examples of linear systems in solving discrete dynamical systems 
and in model fitting with least squares. Maple’s LinearAlgebra package has a 
number of functions and programs that will be very useful for solving these 
problems. 

There are exactly three possibilities for a system of linear equations: the 
system has a unique solution, an infinite number of solutions, or no solution. 
Consider the following two-dimensional system of linear equations: 


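\[
\begin{aligned}
a_1 x + b_1 y &= c_1\\
a_2 x + b_2 y &= c_2
\end{aligned}
\]

The augmented matrix for this system is

\[ A = \left[\begin{array}{cc|c} a_1 & b_1 & c_1\\ a_2 & b_2 & c_2 \end{array}\right]. \]

Row reduce the augmented matrix to find one of the three possible results.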



Unique Solution. Row reduction of A gives

\[ \left[\begin{array}{cc|c} 1 & 0 & d\\ 0 & 1 & e \end{array}\right] \]

which yields the unique solution $(x, y) = (d, e)$.

Infinitely Many Solutions. Row reduction of A gives one of three possibilities.
First,

\[ \left[\begin{array}{cc|c} 1 & b_1/a_1 & c_1/a_1\\ 0 & 0 & 0 \end{array}\right] \]

which yields solutions $(x, y) = \bigl((c_1 - b_1 t)/a_1,\ t\bigr)$ with arbitrary $t$. Second, if
$a_1 = 0$ and $b_1 \neq 0$, then row reduction gives

\[ \left[\begin{array}{cc|c} 0 & 1 & d\\ 0 & 0 & 0 \end{array}\right], \]

and the solution is $(t, d)$ for arbitrary $t$. Third, if $a_1 \neq 0$ and $b_1 = 0$, then
row reduction gives

\[ \left[\begin{array}{cc|c} 1 & 0 & d\\ 0 & 0 & 0 \end{array}\right], \]

and the solution is $(d, t)$ for arbitrary $t$.

No Solution. Row reduction gives

\[ \left[\begin{array}{cc|c} 1 & h & d\\ 0 & 0 & e \end{array}\right] \]

with $e \neq 0$. Then there is no solution as $0x + 0y = e \neq 0$ is impossible.
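For a 2 × 2 system, the three outcomes can be told apart without full row reduction by comparing a few 2 × 2 determinants. The Python sketch below (standard library only; `classify` is our own helper, and it assumes integer coefficients) is one way to express the trichotomy; the three sample systems are ours, chosen only to exercise each branch.

```python
from fractions import Fraction


def classify(a1, b1, c1, a2, b2, c2):
    """Classify a1*x + b1*y = c1, a2*x + b2*y = c2 (integer coefficients)."""
    det = a1 * b2 - a2 * b1
    if det != 0:
        # Cramer's rule gives the single intersection point.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        return "unique", (x, y)
    if a1 == b1 == a2 == b2 == 0:
        # Degenerate system 0 = c1, 0 = c2.
        return ("infinite", None) if c1 == c2 == 0 else ("none", None)
    # Coefficient rows are proportional: the system is consistent exactly
    # when the augmented rows are proportional as well.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinite", None
    return "none", None


print(classify(1, 1, 2, 1, -1, 0))   # two lines crossing at one point
print(classify(1, 1, 2, 2, 2, 4))    # the same line written twice
print(classify(1, 1, 2, 1, 1, 3))    # two distinct parallel lines
```

Row reduction remains the method of choice for larger systems; the determinant test is shown here only because it makes the geometry of the two-line case explicit.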





These alternatives are represented visually in Figure 5.4.














(a) No Solution (b) Unique Solution (c) Infinite Solutions 


FIGURE 5.4: The Three Alternatives for Solutions of a Linear System 


Describe the possibilities for three linear equations in three dimensions; 
that is, for the intersection of three planes in 3D. 

We will rely on the matrix forms in recognizing our solutions in this chap- 
ter. 

The following description is from the Details of the LinearAlgebra Package
Help page.³

³ Enter help("LinearAlgebra, Details") for the full description.

• Maple’s LinearAlgebra package is an efficient and robust suite
of routines for doing computational linear algebra.

• Full rectangular and sparse matrices are fully supported at the
data structure level,⁴ as well as upper and lower triangular matrices,
unit triangular matrices, banded matrices, and a variety of
others. Further, symmetric, skew-symmetric, hermitian, and
skew-hermitian are known qualifiers that are used appropriately
to reduce storage and select amongst algorithms.


According to V. Z. Aladjev, 


The LinearAlgebra module has been designed to accommodate 
different sets of usage scenarios: casual use and programming use. 
Correspondingly, there are functions and notations designed for easy 
casual use (sometimes at the cost of some efficiency), and some func- 
tions designed for maximal efficiency (sometimes at the cost of ease- 
of-use). In this way, the LinearAlgebra facilities scale easily from 
first-year classroom use to heavy industrial usage, emphasizing the 
different qualities that each type of use needs.⁵


The LinearAlgebra commands we will use most are: 


GaussianElimination: performs Gaussian elimination on the matrix A and 
returns an upper triangular matrix U. 


ReducedRowEchelonForm: performs Gaussian elimination on the matrix A 
and returns the reduced row echelon form of A. 


(Note for those who have studied linear algebra: Both of these commands use 
LinearAlgebra’s LUDecomposition function to determine the upper triangular 
factor of A.) 

Recall there are many ways to enter a matrix. One of the easiest is to use 
Maple’s Matrix palette. Enter help("worksheet, matpalette") for details. 

Begin with a simple 3 x 3 system of linear equations. 


Example 5.1. A 3 x 3 System. 
Determine the solution of the following system of linear equations. 
\[
\begin{aligned}
2x + 4y + 4z &= 4\\
x + 3y + z &= 4\\
-x + 3y + 2z &= -1
\end{aligned}
\]














⁴ For those interested in the internal structure of matrices, etc., enter the command
help("linearalgebra,about,data").

⁵ V. Z. Aladjev, Computer Algebra Systems: A New Software Toolbox for Maple, Fultus
Corp., 2004, pg. 451.

Load the linear algebra package, enter the system’s matrix in augmented
form, row reduce the matrix using Gaussian elimination, and then interpret
the results.

> with(LinearAlgebra):
> A := <<2, 1, -1> | <4, 3, 3> | <4, 1, 2> | <4, 4, -1>>;

           [ 2   4   4    4]
      A := [ 1   3   1    4]
           [−1   3   2   −1]

> ReducedRowEchelonForm(A);

      [1   0   0    2]
      [0   1   0    1]
      [0   0   1   −1]

We interpret the results as a unique solution: (x,y,z) = (2,1, —1). 
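To see what the row reduction is doing step by step, the same computation can be written out in a few lines. The Python sketch below (standard library only; `rref` is our own Gauss-Jordan routine using exact fractions, not Maple's algorithm) row reduces the augmented matrix whose columns were entered above.

```python
from fractions import Fraction


def rref(matrix):
    """Reduced row echelon form by Gauss-Jordan elimination with exact fractions."""
    M = [[Fraction(v) for v in row] for row in matrix]
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        if pivot_row == rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        lead = M[pivot_row][col]
        M[pivot_row] = [v / lead for v in M[pivot_row]]
        # Clear the column in every other row.
        for r in range(rows):
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[pivot_row])]
        pivot_row += 1
    return M


A = [[2, 4, 4, 4],
     [1, 3, 1, 4],
     [-1, 3, 2, -1]]
R = rref(A)
for row in R:
    print([str(v) for v in row])
```

The last column of the reduced matrix holds the solution, in agreement with the Maple output.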
Let’s increase the dimension by one. 


Example 5.2. A 4 x 4 System. 
Determine the solution of the following system of linear equations. 


\[
\begin{aligned}
2x + 4y + 4z + 2w &= 2\\
2x + 2y + z - w &= -1\\
x + 4y - z - 2w &= 1\\
-x + 2y + w &= 1
\end{aligned}
\]

> B := Matrix([[2, 4, 4, 2, 2], [2, 2, 1, -1, -1], [1, 4, -1, -2, 1], [-1, 2, 0, 1, 1]]);

           [ 2   4    4    2    2]
      B := [ 2   2    1   −1   −1]
           [ 1   4   −1   −2    1]
           [−1   2    0    1    1]

> ReducedRowEchelonForm(B);

      [1   0   0   −1   0]
      [0   1   0    0   0]
      [0   0   1    1   0]
      [0   0   0    0   1]


We interpret the results: The system has no solution as the last row of the
row reduced augmented matrix gives the equation 0 = 1.

Exercises

Use Maple to solve the following systems of equations and interpret the results.

1. $x - 5y = -154$
   $x - 3y = -84$

2. $11x - 6y = 494$
   $x + 7y = -23$

3. $9x + y = 56$
   $6x - 5y = 128$

4. $6x + y = 50$
   $18x + 3y = 150$

5. $x + 2y + 3z = 5$
   $x - y + 6z = 1$
   $3x - 2y = 4$

6. $x + y + 3z - 4w = 12$
   $3x + y - 2z - w = 0$

7. $2x + 3y + 4z = 5$
   $x - y + 2z = 6$
   $3x - 5y - z = 0$

8. $2x - 3y + 2z = 21$
   $x + 4y - z = 1$
   $-x + 2y + z = 17$

9. $3x - 4y + 5z - 4w = 12$
   $x - y + z - 2w = 0$
   $2x + y + 2z + 3w = 52$
   $2x - 2y + 2z - 3w = 1$











Projects 


Project 5.1. Model a solution methodology for Ax = b using linear algebra 
and illustrate your method with Maple commands. 


Project 5.2. Let A be a random n × (n + 1) matrix representing a linear
system of n linear equations in n variables. Estimate the probability that the
system has a unique solution when
(a) n = 3.
(b) n = 4.
(c) n = 5.
(d) n > 5.








5.3 Models with Unique Solutions Using Systems of 
Linear Equations 


Let’s use the method of joints to analyze the Warren truss from the Introduc- 
tion. 


Example 5.3. Analysis of a Truss by the Method of Joints. 

A segment of a Warren truss with vertical supports that is fixed at the lower
left endpoint $p_1$, and can move horizontally at the lower right endpoint $p_4$,
is displayed in Figure 5.5. The pin-joints of the truss are at $p_1$, $p_2$, $p_3$, and
$p_4$. A load of 10 kilonewtons (kN) is placed at joint $p_3$; the forces on the
truss with magnitudes $f_1$, $f_2$, $f_3$, $f_4$, and $f_5$ are indicated in the figure. The
stationary support member has a horizontal force $F_1$ and a vertical force $F_2$;
the horizontally movable support member has only a vertical force $F_3$.


FIGURE 5.5: A Warren Truss with Vertical Supports Segment 


Since the truss is in static equilibrium, Newton’s Law for equilibrium of 
forces implies that the sum of the force vectors at each joint must be the zero 
vector. Therefore, the sum of the horizontal components of the forces and the 
sum of the vertical components must each be zero. 
Begin with joint $p_1$, the fixed joint at the lower left of the truss. Both
$f_1$ and $f_2$ are forces into $p_1$, while forces $F_1$ and $F_2$ pull away from
$p_1$. Force $f_1$ acts at an angle of $\pi/4$, which resolves into the horizontal
component $(\sqrt{2}/2) f_1$ and vertical component $(\sqrt{2}/2) f_1$. These
forces are at equilibrium, so that

\[
-F_1 + \tfrac{\sqrt{2}}{2} f_1 + f_2 = 0 \quad\text{and}\quad -F_2 + \tfrac{\sqrt{2}}{2} f_1 = 0.
\]

Next, we consider joint $p_2$ at the top of the truss segment. Both $f_1$ and
$f_3$ pull away, while $f_4$ acts into the joint. Resolving the angular forces
and summing gives

\[
-\tfrac{\sqrt{2}}{2} f_1 + \tfrac{\sqrt{3}}{2} f_4 = 0 \quad\text{and}\quad -\tfrac{\sqrt{2}}{2} f_1 - f_3 + \tfrac{1}{2} f_4 = 0.
\]


Complete the model for the forces at each joint and place the force vector 
equations in Table 5.1. Verify that these are correct! 





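TABLE 5.1: Forces on the Truss’s Joints

Joint   Horizontal Component                                     Vertical Component
$p_1$   $-F_1 + \frac{\sqrt{2}}{2} f_1 + f_2 = 0$                $-F_2 + \frac{\sqrt{2}}{2} f_1 = 0$
$p_2$   $-\frac{\sqrt{2}}{2} f_1 + \frac{\sqrt{3}}{2} f_4 = 0$   $-\frac{\sqrt{2}}{2} f_1 - f_3 + \frac{1}{2} f_4 = 0$
$p_3$   $-f_2 + f_5 = 0$                                         $f_3 - 10000 = 0$
$p_4$   $-\frac{\sqrt{3}}{2} f_4 - f_5 = 0$                      $-\frac{1}{2} f_4 - F_3 = 0$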





Write this model as a system of equations in matrix notation Ax = b with 
8 equations in 8 unknowns: 


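\[
\begin{bmatrix}
-1 & 0 & 0 & \sqrt{2}/2 & 1 & 0 & 0 & 0 \\
0 & -1 & 0 & \sqrt{2}/2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -\sqrt{2}/2 & 0 & 0 & \sqrt{3}/2 & 0 \\
0 & 0 & 0 & -\sqrt{2}/2 & 0 & -1 & 1/2 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -\sqrt{3}/2 & -1 \\
0 & 0 & -1 & 0 & 0 & 0 & -1/2 & 0
\end{bmatrix}
\begin{bmatrix} F_1 \\ F_2 \\ F_3 \\ f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 10000 \\ 0 \\ 0 \end{bmatrix}
\]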


Enter the augmented matrix, and then reduce the matrix using 
ReducedRowEchelonForm. Interpret the results to solve the problem. 
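As this matrix is large (for entering by hand, that is), use the Matrix
palette to enter it. Set the size to 8 × 9, click the “Insert Matrix” button, then
fill in the entries.

\[
T := \left[\begin{array}{cccccccc|c}
-1 & 0 & 0 & \frac{\sqrt{2}}{2} & 1 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & \frac{\sqrt{2}}{2} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -\frac{\sqrt{2}}{2} & 0 & 0 & \frac{\sqrt{3}}{2} & 0 & 0 \\
0 & 0 & 0 & -\frac{\sqrt{2}}{2} & 0 & -1 & \frac{1}{2} & 0 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 10000 \\
0 & 0 & 0 & 0 & 0 & 0 & -\frac{\sqrt{3}}{2} & -1 & 0 \\
0 & 0 & -1 & 0 & 0 & 0 & -\frac{1}{2} & 0 & 0
\end{array}\right]
\]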

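> evalf(ReducedRowEchelonForm(T));

\[
\left[\begin{array}{cccccccc|c}
1.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.0 & 1.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & -23660.25405 \\
0.0 & 0.0 & 1.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 13660.25405 \\
0.0 & 0.0 & 0.0 & 1.0 & 0.0 & 0.0 & 0.0 & 0.0 & -33460.65216 \\
0.0 & 0.0 & 0.0 & 0.0 & 1.0 & 0.0 & 0.0 & 0.0 & 23660.25405 \\
0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 1.0 & 0.0 & 0.0 & 10000.0 \\
0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 1.0 & 0.0 & -27320.50810 \\
0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 1.0 & 23660.25405
\end{array}\right]
\]


Note that we used evalf to force decimal arithmetic.

We interpret the solution of the linear system as the magnitude of the
forces: $F_1 = 0$, $F_2 = -23660.25$, $F_3 = 13660.25$, $f_1 = -33460.65$,
$f_2 = 23660.25$, $f_3 = 10000$, $f_4 = -27320.51$, and $f_5 = 23660.25$.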

This problem is continued as an exercise at the end of the section. 
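
As a hand check on the row reduction, the same system can be solved by substitution, starting from the loaded joint $p_3$. The following is a sketch that uses only the joint equations of Table 5.1 with the sign conventions stated there; the closed forms it produces agree with the last column of the reduced matrix.

```latex
\begin{aligned}
p_3:&\quad f_3 = 10000, \qquad f_5 = f_2 \\
p_2:&\quad \tfrac{\sqrt{2}}{2} f_1 = \tfrac{\sqrt{3}}{2} f_4
      \;\Longrightarrow\; -\tfrac{\sqrt{3}}{2} f_4 - f_3 + \tfrac{1}{2} f_4 = 0
      \;\Longrightarrow\; f_4 = \frac{2 f_3}{1 - \sqrt{3}} = -10000\,(1 + \sqrt{3}) \approx -27320.51 \\
p_4:&\quad f_5 = -\tfrac{\sqrt{3}}{2} f_4 = 5000\,(3 + \sqrt{3}) \approx 23660.25,
      \qquad F_3 = -\tfrac{1}{2} f_4 = 5000\,(1 + \sqrt{3}) \approx 13660.25 \\
p_1:&\quad F_2 = \tfrac{\sqrt{2}}{2} f_1 = \tfrac{\sqrt{3}}{2} f_4 = -5000\,(3 + \sqrt{3}) \approx -23660.25,
      \qquad f_1 = \sqrt{2}\, F_2 \approx -33460.65 \\
    &\quad F_1 = \tfrac{\sqrt{2}}{2} f_1 + f_2 = -5000\,(3 + \sqrt{3}) + 5000\,(3 + \sqrt{3}) = 0
\end{aligned}
```

Under these conventions a negative value means the force acts opposite to the direction assumed when the joint equations were written.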










Wassily Leontief, recipient of the 1973 Nobel Prize in Economics, explained
his input-output model showing the interdependencies of the U.S. economy
in the April 1965 issue of Scientific American.⁶ He organized the 1958
American economy into an 81 × 81 matrix. The 81 sectors of the economy, such
as steel, agriculture, manufacturing, transportation, and utilities, each
represented resources that rely on input from the output of other resources.
For example, the production of clothing requires an input from manufacturing,
transportation, agriculture, and other sectors. The following is a brief
example of a Leontief model and its solution.



⁶Wassily W. Leontief, “The Structure of the U.S. Economy,” Scientific American,
April 1965, pp. 25-35.
Example 5.4. A Leontief Input-Output Economic Model.
Consider an open production model, one that doesn’t consume all of its pro- 
duction, where to produce 1 unit of output (units are in millions of dollars): 


the petroleum sector requires 0.1 units of itself, 0.2 units of transportation, 
and 0.4 units of chemicals; 


the textiles sector requires 0.4 units of petroleum, 0.1 units of itself, 0.15 
units of transportation, 0.3 units of chemicals, and 0.35 units of manufac- 
turing; 


the transportation sector requires 0.6 units of petroleum, 0.1 units of itself, 
and 0.25 units of chemicals; 


the chemicals sector requires 0.2 units of petroleum, 0.1 units of textiles, 
0.3 units of transportation, 0.2 units of itself, and 0.1 units of manufac- 
turing; 


the manufacturing sector requires 0.1 units of petroleum, 0.3 units of trans- 
portation, and 0.2 units of itself. 


Table 5.2 shows the technology matrix representing this model. 


TABLE 5.2: Leontief Input-Output Table for Our Five-Sector Economy

                  Input Consumed per Unit of Output
Input         Petrol.   Textiles   Transport.   Chemicals   Manufact.
Petrol.         0.1       0.4        0.6          0.2         0.1
Textiles        0.0       0.1        0.0          0.1         0.0
Transport.      0.2       0.15       0.1          0.3         0.3
Chemicals       0.4       0.3        0.25         0.2         0.0
Manufact.       0.0       0.35       0.0          0.1         0.2








The entries of Table 5.2 form the consumption matrix C for this economy. 


We use C to answer Leontief’s question, “Is there a production level that will 
balance the total demand?” Let the vector x be the total production of the 
economy, and let the vector d be the final demand, the demand external to 
production. Then Cx is the intermediate demand, the amount of production 
consumed internally by the individual sectors of the economy. The Leontief 
Exchange Input-Output model, or Production Equation, is 


\[
x = Cx + d.
\]

Let $I$ be the identity matrix. We can solve for $x$ as follows.

\[
\begin{aligned}
x &= Cx + d \\
x - Cx &= d \\
(I - C)\,x &= d \\
x &= (I - C)^{-1} d
\end{aligned}
\]


If the economy produces 900 million dollars of petroleum, 300 million dol- 
lars of textiles, 850 million dollars of transportation, 800 million dollars of 
chemicals, and 750 million dollars of manufacturing, how much of this pro- 
duction is internally consumed by the economy? 

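Use Maple to enter the augmented matrix $L = [\,I - C \mid d\,]$ and row reduce it.
Remember to load LinearAlgebra first. The matrix $I - C$ is entered column by
column and stored as IC.

> IC := < <0.9, 0, -0.2, -0.4, 0> | <-0.4, 0.9, -0.15, -0.3, -0.35>
        | <-0.6, 0, 0.9, -0.25, 0> | <-0.2, -0.1, -0.3, 0.8, -0.1>
        | <-0.1, 0, -0.3, 0, 0.8> > :
  d := <900, 300, 850, 800, 750> :
  L := <IC | d>;

\[
L := \left[\begin{array}{ccccc|c}
0.9 & -0.4 & -0.6 & -0.2 & -0.1 & 900 \\
0 & 0.9 & 0 & -0.1 & 0 & 300 \\
-0.2 & -0.15 & 0.9 & -0.3 & -0.3 & 850 \\
-0.4 & -0.3 & -0.25 & 0.8 & 0 & 800 \\
0 & -0.35 & 0 & -0.1 & 0.8 & 750
\end{array}\right]
\]

> evalf~(ReducedRowEchelonForm(L), 5);

\[
\left[\begin{array}{ccccc|c}
1.0 & 0.0 & 0.0 & 0.0 & 0.0 & 6944.2 \\
0.0 & 1.0 & 0.0 & 0.0 & 0.0 & 1070.0 \\
0.0 & 0.0 & 1.0 & 0.0 & 0.0 & 5620.6 \\
0.0 & 0.0 & 0.0 & 1.0 & 0.0 & 6629.8 \\
0.0 & 0.0 & 0.0 & 0.0 & 1.0 & 2234.3
\end{array}\right]
\]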


We interpret the unique solution as the amounts the sectors need to produce 
to meet the total demand: 








petroleum: x1 = 6944.2
textiles: x2 = 1070.0
transportation: x3 = 5620.6
chemicals: x4 = 6629.8
manufacturing: x5 = 2234.3 
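
The production equation x = Cx + d can also be read as a recipe: start by producing the final demand, add what that production consumes, and repeat. The Python sketch below follows that reading as a fixed-point iteration. It is an illustration, not the row reduction used in the example; the function names `matvec` and `leontief_production` and the stopping tolerance are assumptions made here, and the iteration only settles down when every round of internal consumption is smaller than the one before, as it is for Table 5.2.

```python
# Consumption matrix C from Table 5.2 (rows: inputs, columns: outputs)
C = [[0.1, 0.4, 0.6, 0.2, 0.1],
     [0.0, 0.1, 0.0, 0.1, 0.0],
     [0.2, 0.15, 0.1, 0.3, 0.3],
     [0.4, 0.3, 0.25, 0.2, 0.0],
     [0.0, 0.35, 0.0, 0.1, 0.2]]
d = [900.0, 300.0, 850.0, 800.0, 750.0]   # final demand, millions of dollars

def matvec(A, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def leontief_production(C, d, tol=1e-9, max_rounds=10_000):
    """Iterate x <- C x + d until successive iterates agree to within tol."""
    x = list(d)
    for _ in range(max_rounds):
        x_new = [cx + di for cx, di in zip(matvec(C, x), d)]
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

x = leontief_production(C, d)       # total production
internal = matvec(C, x)             # intermediate demand C x

for name, total, used in zip(
        ["petroleum", "textiles", "transportation", "chemicals", "manufacturing"],
        x, internal):
    print(f"{name:15s} total {total:9.1f}   consumed internally {used:9.1f}")
```

The vector `internal` is the intermediate demand Cx, so each sector's total production splits into the part consumed by the economy itself and the part left to meet the final demand d.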
We studied the method of least squares in Chapter 5 of Volume 1. We used
multivariable calculus to minimize the sum of squares of the errors leading to
least squares in Chapter 4, Volume 2. Now, we will consider least squares
curve fitting as solving a system of linear equations.


The sum of squares of the errors from fitting a parabola $f(x) = ax^2 + bx + c$
is given by

\[
S = \sum_{i=1}^{n} \bigl( y_i - (a x_i^2 + b x_i + c) \bigr)^2 .
\]

To minimize $S$, first set the first partial derivatives equal to 0 and then solve
the resulting system. After a little algebra, we have the normal equations

\[
\begin{aligned}
\frac{\partial S}{\partial a} = 0 &= \sum x_i^2 y_i - a \sum x_i^4 - b \sum x_i^3 - c \sum x_i^2 \\
\frac{\partial S}{\partial b} = 0 &= \sum x_i y_i - a \sum x_i^3 - b \sum x_i^2 - c \sum x_i \\
\frac{\partial S}{\partial c} = 0 &= \sum y_i - a \sum x_i^2 - b \sum x_i - c \sum 1
\end{aligned}
\]

Rewrite these equations as a system of linear equations.

\[
\begin{aligned}
a \sum x_i^4 + b \sum x_i^3 + c \sum x_i^2 &= \sum x_i^2 y_i \\
a \sum x_i^3 + b \sum x_i^2 + c \sum x_i &= \sum x_i y_i \\
a \sum x_i^2 + b \sum x_i + c \sum 1 &= \sum y_i
\end{aligned}
\tag{5.1}
\]


Observe the pattern in the equations; what would the system be for a cubic fit? 
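
The pattern in (5.1) is that the row belonging to the coefficient of $x^p$ collects the sums $\sum x_i^{p+q}$ on the left and $\sum x_i^p y_i$ on the right. The Python sketch below builds the system from that pattern for any degree and solves it exactly. The helper names `solve` and `fit_polynomial` and the sample data (five points chosen to lie exactly on $y = 2x^2 - 3x + 1$) are hypothetical and serve only to show that the construction recovers a known parabola.

```python
from fractions import Fraction

def solve(A, b):
    """Gauss-Jordan elimination with exact rationals; A is assumed nonsingular."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        src = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[src] = aug[src], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [u - factor * v for u, v in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def fit_polynomial(xs, ys, degree):
    """Least squares coefficients, highest power first, from the normal equations."""
    powers = range(degree, -1, -1)
    xs = [Fraction(x) for x in xs]
    ys = [Fraction(y) for y in ys]
    # Row for x^p: sums of x^(p+q) across the columns, sum of x^p * y on the right.
    A = [[sum(x ** (p + q) for x in xs) for q in powers] for p in powers]
    b = [sum(x ** p * y for x, y in zip(xs, ys)) for p in powers]
    return solve(A, b)

# Hypothetical data lying exactly on y = 2x^2 - 3x + 1
xs = [-1, 0, 1, 2, 3]
ys = [6, 1, 0, 3, 10]
coeffs = fit_polynomial(xs, ys, 2)
print([str(c) for c in coeffs])   # the fit reproduces the coefficients 2, -3, 1
```

Calling `fit_polynomial(xs, ys, 3)` builds the corresponding four-equation system for a cubic, which is one way to check an answer to the question above.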








Example 5.5. A Least Squares Model as a System of Equations. 
Fit a quadratic model $f(x) = ax^2 + bx + c$ to the following data:


x |  0   1   2   3   4
y | 62  50  39  18   2





Substitute the following into (5.1).

\[
\sum 1 = 5, \quad \sum x_i = 10, \quad \sum x_i^2 = 30, \quad \sum x_i^3 = 100, \quad \sum x_i^4 = 354,
\]
\[
\sum y_i = 171, \quad \sum x_i y_i = 190, \quad \sum x_i^2 y_i = 400
\]


We have the linear system

\[
\begin{aligned}
354a + 100b + 30c &= 400 \\
100a + 30b + 10c &= 190 \\
30a + 10b + 5c &= 171
\end{aligned}
\]


Enter the augmented coefficient matrix for the system into Maple and row
reduce to find the solution.


> A := Matrix([[354, 100, 30, 400], [100, 30, 10, 190], [30, 10, 5, 171]]) :
> ReducedRowEchelonForm(A);

\[
\begin{bmatrix}
1 & 0 & 0 & -\frac{9}{7} \\
0 & 1 & 0 & -\frac{352}{35} \\
0 & 0 & 1 & \frac{2171}{35}
\end{bmatrix}
\]


The output yields $a = -9/7$, $b = -352/35$, and $c = 2171/35$. Therefore,
the quadratic model for our data is

\[
f(x) = -\frac{9}{7}\,x^2 - \frac{352}{35}\,x + \frac{2171}{35}.
\]


Check the fit.

> with(plots) :
> f := x -> -9/7*x^2 - 352/35*x + 2171/35 :


> Pts := zip((x, y) -> [x, y], [$0..4], [62, 50, 39, 18, 2]) :
> DataPlot := pointplot(Pts, symbol = solidcircle, symbolsize = 18) :
> fPlot := plot(f, -1..5, -10..70, thickness = 2) :
> display(DataPlot, fPlot);


[Plot: the five data points and the fitted parabola, drawn for x from -1 to 5
and y from -10 to 70.]





The plot shows visually that the fit captures the trend of the data. 
Our next example of an application of systems of linear equations is a 
different method of fitting a curve to data: this time, smoothly connecting 
data points via cubic polynomials to form a piecewise curve. These curves are
called “cubic splines,” cubic for the third degree and spline after the flexible 
draftsman’s ruler originally used to draw the curves. The data points are called 
“knots.” Pierre Bézier, a French design engineer at Renault, introduced using 
splines for designing auto bodies in the 1960s. Splines are ubiquitous today, 
appearing in numerous applications. TrueType, developed by Apple in the 
1990s, uses two-dimensional quadratic splines to form the characters on your 
phone or computer screen, and PostScript, developed by Adobe in the 1980s, 
uses two-dimensional cubic splines for printed characters. 


Example 5.6. Natural Cubic Splines as a System of Equations. 
Fit a natural cubic spline⁷ to the data


x |   7    14    21    28    35    42
y | 125   275   800  1200  1700  1650





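We will need 5 third-order equations, one for each interval between
consecutive data points.

\[
\begin{aligned}
\text{for } x \in [7, 14]: \quad S_1(x) &= a_{1,3} x^3 + a_{1,2} x^2 + a_{1,1} x + a_{1,0} \\
\text{for } x \in [14, 21]: \quad S_2(x) &= a_{2,3} x^3 + a_{2,2} x^2 + a_{2,1} x + a_{2,0} \\
\text{for } x \in [21, 28]: \quad S_3(x) &= a_{3,3} x^3 + a_{3,2} x^2 + a_{3,1} x + a_{3,0} \\
\text{for } x \in [28, 35]: \quad S_4(x) &= a_{4,3} x^3 + a_{4,2} x^2 + a_{4,1} x + a_{4,0} \\
\text{for } x \in [35, 42]: \quad S_5(x) &= a_{5,3} x^3 + a_{5,2} x^2 + a_{5,1} x + a_{5,0}
\end{aligned}
\]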


There are 20 unknowns $a_{i,j}$, so we need 20 equations to uniquely solve
for these unknowns. The $(x, y)$ data pairs give 10 equations, $S_1(7) = 125$,
$S_1(14) = 275$, $S_2(14) = 275$, $S_2(21) = 800$, and so forth. We need 10 more
equations.

To make the spline smooth, we require the curve to have both first and
second derivatives equal, i.e., match slope and concavity, at the interior
knots (data points). So, for $i = 1, \dots, 4$, make

\[
S_i'(x_{i+1}) = S_{i+1}'(x_{i+1}) \quad\text{and}\quad S_i''(x_{i+1}) = S_{i+1}''(x_{i+1}).
\]

These requirements give us 8 more equations. For the last two equations, use
the condition for “natural” splines:

\[
S_1''(x_1) = 0 \quad\text{and}\quad S_5''(x_6) = 0.
\]

We have the requisite 20 equations. Let’s use Maple to build the equations,
the augmented coefficient matrix, and find the spline.



⁷Natural cubic splines mimic the original engineer’s flexible ruler by requiring that the
second derivative of the spline at the two end data points be 0. “Clamped” cubic splines
use specified values for the first derivative at the two end data points.
First, we load the needed packages and enter the data.


> with(LinearAlgebra) :
  with(plots) :
> N := 6 :
  X := [(7*k) $ k = 1..6] :
  Y := [125, 275, 800, 1200, 1700, 1650] :
  Pts := zip((x, y) -> [x, y], X, Y);

      Pts := [[7, 125], [14, 275], [21, 800], [28, 1200], [35, 1700], [42, 1650]]

We define the individual cubic functions using an “indexed procname” to make
the code more general.

> S := proc(x)
    local i;
    i := op(procname);
    return a[i,3]*x^3 + a[i,2]*x^2 + a[i,1]*x + a[i,0];
  end proc:

Now define the natural cubic spline function for our 6 point data set.

> Spline := proc(x)
    piecewise(x < X[2], S[1](x), x < X[3], S[2](x), x < X[4], S[3](x),
              x < X[5], S[4](x), S[5](x));
  end proc:

In order to avoid some convoluted statements, we’ll define the cubic segment’s
derivative functions directly.

> dS := proc(x)
    local i;
    i := op(procname);
    return 3*a[i,3]*x^2 + 2*a[i,2]*x + a[i,1];
  end proc:
  d2S := proc(x)
    local i;
    i := op(procname);
    return 6*a[i,3]*x + 2*a[i,2];
  end proc:

Let’s calculate the 3 sets of equations that make up the system of 20 linear
equations.

> KnotEqs := seq('S[i](X[i]) = Y[i], S[i](X[i+1]) = Y[i+1]', i = 1..N-1) :
  SmoothEqs := seq('dS[i](X[i+1]) = dS[i+1](X[i+1])', i = 1..N-2),
               seq('d2S[i](X[i+1]) = d2S[i+1](X[i+1])', i = 1..N-2) :
  NaturalEqs := d2S[1](X[1]) = 0, d2S[5](X[6]) = 0 :

Displaying the sets as vectors makes them much easier to read.

> <KnotEqs>;

\[
\begin{bmatrix}
343\,a_{1,3} + 49\,a_{1,2} + 7\,a_{1,1} + a_{1,0} = 125 \\
2744\,a_{1,3} + 196\,a_{1,2} + 14\,a_{1,1} + a_{1,0} = 275 \\
2744\,a_{2,3} + 196\,a_{2,2} + 14\,a_{2,1} + a_{2,0} = 275 \\
9261\,a_{2,3} + 441\,a_{2,2} + 21\,a_{2,1} + a_{2,0} = 800 \\
9261\,a_{3,3} + 441\,a_{3,2} + 21\,a_{3,1} + a_{3,0} = 800 \\
21952\,a_{3,3} + 784\,a_{3,2} + 28\,a_{3,1} + a_{3,0} = 1200 \\
21952\,a_{4,3} + 784\,a_{4,2} + 28\,a_{4,1} + a_{4,0} = 1200 \\
42875\,a_{4,3} + 1225\,a_{4,2} + 35\,a_{4,1} + a_{4,0} = 1700 \\
42875\,a_{5,3} + 1225\,a_{5,2} + 35\,a_{5,1} + a_{5,0} = 1700 \\
74088\,a_{5,3} + 1764\,a_{5,2} + 42\,a_{5,1} + a_{5,0} = 1650
\end{bmatrix}
\]

> <SmoothEqs>;

\[
\begin{bmatrix}
588\,a_{1,3} + 28\,a_{1,2} + a_{1,1} = 588\,a_{2,3} + 28\,a_{2,2} + a_{2,1} \\
1323\,a_{2,3} + 42\,a_{2,2} + a_{2,1} = 1323\,a_{3,3} + 42\,a_{3,2} + a_{3,1} \\
2352\,a_{3,3} + 56\,a_{3,2} + a_{3,1} = 2352\,a_{4,3} + 56\,a_{4,2} + a_{4,1} \\
3675\,a_{4,3} + 70\,a_{4,2} + a_{4,1} = 3675\,a_{5,3} + 70\,a_{5,2} + a_{5,1} \\
84\,a_{1,3} + 2\,a_{1,2} = 84\,a_{2,3} + 2\,a_{2,2} \\
126\,a_{2,3} + 2\,a_{2,2} = 126\,a_{3,3} + 2\,a_{3,2} \\
168\,a_{3,3} + 2\,a_{3,2} = 168\,a_{4,3} + 2\,a_{4,2} \\
210\,a_{4,3} + 2\,a_{4,2} = 210\,a_{5,3} + 2\,a_{5,2}
\end{bmatrix}
\]

> <NaturalEqs>;

\[
\begin{bmatrix}
42\,a_{1,3} + 2\,a_{1,2} = 0 \\
252\,a_{5,3} + 2\,a_{5,2} = 0
\end{bmatrix}
\]


Get a list of the variables of the system: the coefficients of the equations.


> indets([KnotEqs]) :
  vars := convert(%, list);

\[
vars := [a_{1,0}, a_{1,1}, a_{1,2}, a_{1,3}, a_{2,0}, a_{2,1}, a_{2,2}, a_{2,3}, a_{3,0}, a_{3,1}, a_{3,2}, a_{3,3}, a_{4,0},
a_{4,1}, a_{4,2}, a_{4,3}, a_{5,0}, a_{5,1}, a_{5,2}, a_{5,3}]
\]





Use LinearAlgebra’s GenerateMatrix to create the augmented matrix for the 
system. (To see the full matrices, execute interface(rtablesize=50) before exe- 
cuting the next Maple command.) 
> A := GenerateMatrix([KnotEqs, SmoothEqs, NaturalEqs], vars,
                      augmented = true);

\[
A := \begin{bmatrix}
1 & 7 & 49 & 343 & 0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
1 & 14 & 196 & 2744 & 0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 1 & 14 & 196 & 2744 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 1 & 21 & 441 & 9261 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 21 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 28 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
\vdots & & & & & & & & & & 
\end{bmatrix}
\]
      20 x 21 Matrix

Compute the reduced row echelon form of A.


> RA := ReducedRowEchelonForm(A);

      RA := [20 x 21 Matrix]


Use LinearAlgebra’s GenerateEquations to capture the values of the coeffi- 
cients. 
> theFit := GenerateEquations(RA, vars);

\[
\begin{aligned}
theFit := \Bigl[\, & a_{1,0} = -25,\; a_{1,1} = \tfrac{79000}{1463},\; a_{1,2} = -\tfrac{71475}{10241},\; a_{1,3} = \tfrac{23825}{71687}, \\
& a_{2,0} = \tfrac{511375}{209},\; a_{2,1} = -\tfrac{695900}{1463},\; a_{2,2} = \tfrac{28725}{931},\; a_{2,3} = -\tfrac{40750}{71687}, \\
& a_{3,0} = -\tfrac{1525100}{209},\; a_{3,1} = \tfrac{1340575}{1463},\; a_{3,2} = -\tfrac{362850}{10241},\; a_{3,3} = \tfrac{1825}{3773}, \\
& a_{4,0} = \tfrac{3953300}{209},\; a_{4,1} = -\tfrac{2768225}{1463},\; a_{4,2} = \tfrac{664350}{10241},\; a_{4,3} = -\tfrac{7275}{10241}, \\
& a_{5,0} = -\tfrac{6559200}{209},\; a_{5,1} = \tfrac{3539275}{1463},\; a_{5,2} = -\tfrac{597150}{10241},\; a_{5,3} = \tfrac{33175}{71687} \,\Bigr]
\end{aligned}
\]






Set the values of the coefficients. 
| > assign(theFit) : 
And plot the results. 


| > SplinePlot := plot(Spline(x), x = X1..Xn) : 
DataPlot := pointplot(Pts, symbol = solidcircle, symbolsize = 18) : 
display(SplinePlot, DataPlot); 
The natural cubic spline fits the data very well. Caution: do not use the spline
outside the data! Plot over a larger range to see why. 
Use Maple’s 


CurveFitting:-Spline( Pts, x,3, endpoints = ’natural’); 


to verify that we have found the curve. 
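
An independent check is also possible outside Maple. The Python sketch below rebuilds the same 20 equations (10 knot, 8 smoothness, 2 natural end conditions) for the data of Example 5.6 and solves them with exact fractions. The helper names (`value`, `slope`, `curvature`, `equation`, `gauss_jordan`, `spline`) are hypothetical and the layout of the unknowns is a choice made here; it mirrors the order of `vars`, lowest power first within each segment.

```python
from fractions import Fraction

X = [7, 14, 21, 28, 35, 42]
Y = [125, 275, 800, 1200, 1700, 1650]
SEGMENTS = len(X) - 1        # 5 cubic pieces
UNKNOWNS = 4 * SEGMENTS      # a[i][0..3] for each piece

def value(x):      # multipliers of a0, a1, a2, a3 in S(x)
    return [1, x, x**2, x**3]

def slope(x):      # multipliers in S'(x)
    return [0, 1, 2 * x, 3 * x**2]

def curvature(x):  # multipliers in S''(x)
    return [0, 0, 2, 6 * x]

def equation(parts, rhs):
    """One augmented row; parts is a list of (segment, sign, multipliers)."""
    row = [Fraction(0)] * (UNKNOWNS + 1)
    for seg, sign, mults in parts:
        for power, m in enumerate(mults):
            row[4 * seg + power] += sign * m
    row[-1] = Fraction(rhs)
    return row

rows = []
for i in range(SEGMENTS):                  # each cubic passes through its two knots
    rows.append(equation([(i, 1, value(X[i]))], Y[i]))
    rows.append(equation([(i, 1, value(X[i + 1]))], Y[i + 1]))
for i in range(SEGMENTS - 1):              # slopes and concavities match at interior knots
    x = X[i + 1]
    rows.append(equation([(i, 1, slope(x)), (i + 1, -1, slope(x))], 0))
    rows.append(equation([(i, 1, curvature(x)), (i + 1, -1, curvature(x))], 0))
rows.append(equation([(0, 1, curvature(X[0]))], 0))               # natural left end
rows.append(equation([(SEGMENTS - 1, 1, curvature(X[-1]))], 0))   # natural right end

def gauss_jordan(aug):
    """Solve a square, nonsingular system given as augmented rows (modified in place)."""
    n = len(aug)
    for col in range(n):
        src = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[src] = aug[src], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [u - factor * v for u, v in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

solution = gauss_jordan(rows)
a = [solution[4 * i: 4 * i + 4] for i in range(SEGMENTS)]   # a[i] = [a0, a1, a2, a3]

def spline(x):
    """Evaluate the piecewise cubic; intended for x between X[0] and X[-1]."""
    seg = next((i for i in range(SEGMENTS) if x <= X[i + 1]), SEGMENTS - 1)
    return sum(c * x**power for power, c in enumerate(a[seg]))

print([str(c) for c in a[0]])   # coefficients of the first cubic, constant term first
```

Because the arithmetic is exact, the entries of `a` can be compared directly with the fractions reported by GenerateEquations.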
In general, to connect N knots (data points), we need N − 1 cubic equations.
To set up the system of equations, we will need:


1. 2(N − 1) equations matching the two endpoints of each of the N − 1
connecting cubic curves to the data,

2. (N − 2) equations matching the first derivatives at the N − 2 connecting
points of the cubic curves,


3. (N − 2) equations matching the second derivatives at the N − 2 connecting
points of the cubic curves,


4. • either 2 equations for natural cubic splines setting the second derivative
of the left end of the first cubic and the right end of the last cubic to 0, or
• 2 equations for clamped cubic splines setting the first derivative of the
left end of the first cubic and the right end of the last cubic to given values.


Combining these equations gives a total of 4N − 4 equations for our system
with 4(N − 1) variables. For the N = 6 knots of Example 5.6, this is
10 + 4 + 4 + 2 = 20 equations in 20 unknowns.
Exercises


1. A Bridge Too Far. Revisit the truss of Example 5.3. Set up and solve a
model of the truss:


(a) if the angles at joint $p_1$ and joint $p_4$ are both $\pi/3$.


(b) if the angles at joint $p_1$ and joint $p_4$ are both $\pi/4$.


(c) if the angles at joint $p_1$ and joint $p_4$ are both $\pi/6$.


2. Consider the Leontief Input-Output model of Example 5.4. 


(a) Determine the solution for the following technology matrix: 





Petrol. Textiles Transport. Chem. Manufact. 
Petrol. 0.2 0.3 0.6 0.2 0.1 
Textiles 0.0 0.2 0.0 0.1 0.0 
Transport. 0.2 0.15 0.2 0.3 0.3 
Chem. 0.4 0.35 0.25 0.2 0.0 
Manufact. 0.0 0.35 0.0 0.1 0.2 


(b) If the economy now produces 1,000 million dollars of petroleum, 400 
million dollars of textiles, 950 million dollars of transportation, 750 
million dollars of chemicals, and 950 million dollars of manufacturing, 
how much of this production is internally consumed by the economy? 


3. Use the least squares technique to fit the model $f(x) = kx^2$ to the
following data:

x | 0.5  1.0  1.5   2.0   2.5
y | 0.7  3.4  7.2  12.4  20.1

4. Use the least squares technique to fit the model $W = kL^3$ to the following
data:


Length L | 12.5  12.625  12.625  14.125  14.5  14.5  17.27  17.75
Weight W | 17    16      17      23      26    27    43     49


5. Use cubic spline interpolation to fit the following sets of data points.
(a) x | 1   2   3
    y | 9  27  48


(b) x | 0.5  1.0  1.5   2.0   2.5
    y | 0.7  3.4  7.2  12.4  20.1














Projects 


Project 1. One of the main decisions that manufacturers have is to choose
how much of a product to produce. Considering that an economy consists
of producers of many things, this problem gets very complex very quickly.
Leontief received the Nobel Prize in Economics in 1973 “for the development
of the input-output method and for its application to important economic
problems.”⁸ What we will discuss now comes from his work, hence the name
Leontief Model. An example of such a model follows.


First, divide an economy into certain sectors. In reality, there are hundreds of
sectors, but for the sake of simplicity, we will posit three sectors: manufacturing
(m), electronics (e), and agriculture (a). We must decide how many units
of each sector to produce, expressing the amounts as a production vector

\[
x = \begin{bmatrix} m \\ e \\ a \end{bmatrix}.
\]

Now, let’s say that the public wants 100 units of manufacturing, 200 units
of electronics, and 300 units of agriculture. Put this data into an (external)
demand vector

\[
d = \begin{bmatrix} 100 \\ 200 \\ 300 \end{bmatrix}.
\]


Now, one might say that if this is what the economy wants, then this is
what should be produced; i.e., x = d. The problem, however, is that the
production of certain resources actually consumes resources as well. In other
words, it takes stuff to make stuff. How much is consumed by production can
be expressed by using an input-output matrix T.

                    outputs
                  m     e     a
             m | 0.1   0.2   0.3
T =  inputs  e | 0.2   0.1   0.3
             a | 0.1   0.1   0.2



⁸See https://www.nobelprize.org/prizes/economic-sciences/1973/leontief/biographical/.


The matrix T indicates that the production of 1 unit of manufacturing uses up 
0.1 units of manufacturing, 0.2 units of electronics, and 0.1 units of agriculture. 
The production of 1 unit of electronics uses up 0.2 units of manufacturing, 0.1 
units of electronics, and 0.1 units of agriculture. Finally, the production of 1 
unit of agriculture uses up 0.3 units of manufacturing, 0.3 units of electronics, 
and 0.1 units of agriculture. So, not only must we account for what the people 
want, but we must also make up for what is used up in the process of producing 
what the people want. The amount consumed by production is called internal
demand; it is given by the product Tx. Hence, what we produce needs to
satisfy both internal demand and external demand. That is, 


x=Tx+d 


Using the matrix T above, determine the production vector x, also called the 
total demand, and the internal demand Tx for our three-sector economy. 


Project 2. Wassily Leontief (1906-1999) was a Russian-born, Nobel Prize-winning
American economist who, aside from developing highly sophisticated economic
theories, also enjoyed trout fishing, ballet, and fine wines. In this project,
we will look at a very simple special case of his work called a closed exchange
model.


Long, long ago, far, far away in the land of Eigenbazistan, in a small country 
town called Matrixville, there lived a Farmer, a Tailor, a Carpenter, a Coal 
Miner and Slacker Bob. The Farmer produced food; the Tailor, clothes; the 
Carpenter, housing; the Coal Miner supplied energy; and Slacker Bob made 
High Quality 100-Proof Moonshine, half of which he drank himself. Let’s make 
the following assumptions: 


• Everyone buys from and sells to the central pool (i.e., there is no outside
supply or demand).
• Everything produced is consumed.


This type of economy is called a closed exchange model. Table 5.3 specifies 
what fraction of each of the goods is consumed by each person in our town. 
TABLE 5.3: Matrixville’s Closed Exchange Model





Food Clothes Housing Energy Moonshine 
Farmer | 0.25 0.15 0.25 0.18 0.20 
Tailor | 0.15 0.28 0.18 0.17 0.05 
Carpenter | 0.22 0.19 0.22 0.22 0.10 
Coal Miner | 0.20 0.15 0.20 0.28 0.15 
Slacker Bob | 0.18 0.23 0.15 0.15 0.50 





So for example, the Carpenter consumes 22% of all food, 19% of all clothes, 
22% of all housing, 22% of all energy and 10% of all High Quality 100 Proof 
Moonshine. 


If the matrix $I - T$ is invertible, the total demand equation, $x = Tx + d$, can
be solved for $x$.


(a) Determine $C = I - T$ using Table 5.3.

(b) If possible, determine $C^{-1}$.

(c) If feasible, compute the total demand $x$ for this closed exchange model.


(d) For those who have studied linear algebra: what is the relation between 
the total demand x and the eigenvalues and eigenvectors of T, if any? 








5.4 Stoichiometric Chemical Balancing and Infinitely 
Many Solutions 


During your life you have witnessed numerous chemical reactions. How would
you describe them to someone else? How could you obtain quantitative
information about the reaction? Chemists use stoichiometric chemical equations⁹
to answer these questions.

By definition, a chemical equation is a written representation of a chemical 
reaction, showing the reactants and products, their physical states, and the 
direction in which the reaction proceeds. In addition, many chemical equations 
designate the conditions necessary (such as high temperature) for the reaction 
to occur. A chemical equation provides stoichiometric information about a 
chemical reaction, only if the equation is balanced. 

For a chemical equation to be balanced, the same number of each kind
of atom must be present on both sides of the chemical equation. The French
chemist Antoine-Laurent de Lavoisier¹⁰ (1743-1794) introduced the law of
conservation of matter during the latter half of the eighteenth century. The
conservation law states that matter can neither be created nor destroyed.
Lavoisier’s principles of naming chemical substances are still used today.

⁹See LibreTexts’ Chemistry Library “Stoichiometry and Balancing Equations”.

John Dalton¹¹ (1766-1844), developer of the first useful atomic theory
of matter in the early 1800s, was the first to associate the ancient idea of 
atoms with stoichiometry. Dalton concluded, while studying meteorology, that 
evaporated water exists in air as an independent gas. Solid bodies cannot 
occupy the same space at the same time, but obviously water and air could. If 
the water and air were composed of discrete particles, evaporation might be 
viewed as mixing their separate particles. He performed a series of experiments 
on mixtures of gases to determine what effect properties of the individual gases 
had on the properties of the mixture as a whole. While trying to explain the 
results of those experiments, Dalton hypothesized that the sizes of the particles 
making up different gases must be different. Dalton wrote 


it became an object to determine the relative sizes and weights, 
together with the relative numbers of atoms entering into such com- 
binations... Thus a train of investigation was laid for determining the 
number and weight of all chemical elementary particles which enter 
into any sort of combination one with another.¹²


According to Dalton’s Atomic Theory of Matter of 1803, all substances are 
composed of atoms. During a chemical reaction atoms may be combined, 
separated, or rearranged, but not created or destroyed. The postulates include: 


1. All matter is composed of atoms, indivisible and indestructible objects, 
which are the ultimate chemical particles. 


2. All atoms of a given element are identical, both in mass and in properties. 
Atoms of different elements have different masses and different properties. 


3. Compounds are formed by combination of two or more different kinds of 
atoms. Atoms combine in the ratio of small whole numbers. 


4. Atoms are the units of chemical change. A chemical reaction involves only
the combination, separation, or rearrangement of atoms. 


Let’s examine the meaning of chemical equations and compare them to
mathematical equations to gain some insights. A chemical equation identifies the
starting and finishing chemicals as reactants and products:
reactants → products.



¹⁰See https://www.britannica.com/biography/Antoine-Lavoisier.

¹¹See https://www.britannica.com/biography/John-Dalton.

¹²The quote is from Dalton’s notes for a lecture to the Royal Institution in London in
1810.
Chemical Equations vs. Mathematical Equations


We usually think of an equation like $x + 2x = 3x$ as purely mathematical, even
if x represents a physical quantity like distance or mass. A chemical equation 
may look like a mathematical equation, but it describes experimental obser- 
vations: the quantities and kinds of reactants and products for a particular 
chemical reaction. Reactants appear on the left side of a chemical equation; 
products on the right. For example, see Table 5.4. The products, which are the 
result of combining the reactants, are known from experimental observations 
— they are not derived mathematically. 


TABLE 5.4: Chemical Equation vs. Mathematical Equation

Chemical Equation                                                    | Mathematical Equation
Combustion of Propane                                                | Linear Equation
$\mathrm{C_3H_8 + 5\,O_2 \longrightarrow 3\,CO_2 + 4\,H_2O}$         | $x = 2x + 3$
Left: reactants; right: products                                     | No standard left/right order
Balanced when it reflects the conservation of matter                 | Solved when all values that give a true statement are found





In fact, combining the same reactants at different concentrations or
temperatures can often produce different products. First-year
chemistry students cannot predict these effects, and are generally not asked
to predict them. On the other hand, a chemical equation is similar to a math- 
ematical equation in that there are certain restrictions on what may appear 
on the left and right sides. These mathematical rules represent the effects of 
conservation of matter on the reaction—conservation of matter says that no 
atoms are destroyed or created during a chemical reaction. 


How a Chemist Approaches Balancing an Equation 


Many chemical equations, in the view of the chemists, can be balanced by 
inspection, that is, by the process of “trial and error.” The objectives of a 
chemist are: 


• Recognize a balanced equation.
• Recognize an unbalanced equation.
• Balance by inspecting chemical equations with given reactants and products.
• Write the unbalanced equation when given compound names for reactants
and products.
According to chemistry textbooks, the step-wise procedure to balance
equations is:


STEP 1. Determine what reaction is occurring: know the reactants, the prod- 
ucts, and the physical states. 


STEP 2. Write the unbalanced equation that summarizes the reaction 
described in Step 1. 


STEP 3. Balance the equation by inspection, starting with the most compli- 
cated molecules. Do not change the identities of any reactant or product. 


Thus, balancing the equation is done by inspection, a “trial and error” process 
that some students catch on to quickly, but others struggle with in frustration. 


Balancing Equations with Systems of Equations 


Balancing chemical equations can be an application of solving a system of
linear equations. Placing variables as the multipliers for each compound and
making equations for each type of atom in the reaction results in a system
of linear equations. This system is usually under-determined: there are more
variables than equations. The system is also homogeneous, so it always has the
zero solution; an under-determined system of this kind therefore has infinitely
many solutions. Our goal is to provide a procedure to find an integer solution
from among the infinitely many solutions. This method is best grasped through
an example which also lays the foundation for balancing more complicated
oxidation-reduction equations.


Example 5.7. A Sulfur Dioxide Reaction. 
Balance the chemical equation 


Se + O2 ae SOo2 


STEP 1. Introduce multipliers for each compound. There are three com- 
pounds, so we identify three multipliers: {x£1, £2, £3}. The reaction equation 
becomes 

1186 + £202 — 13502 


STEP 2. Identify all elements: S (sulfur) and O (oxygen). Set up an equation 
for each element involving the amounts (multipliers) for that chemical on the 
reactant side equaling the amount on the product side. 

S: 621 = 73 

O: 2x2 = 2x3 
STEP 3. Create a homogeneous system; that is, put all variables on the left 


side of the equation and zero on the right. 


S: 62; — 23 =0 
O: 2x2 — 2x73 = 0 
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STEP 4. Write the system in matrix notation Ax = b where A is the coeffi- 
cient matrix and b is the column vector of zeros. Then form the augmented 
matrix for the system. 


T 


STEP 5. Apply Gaussian elimination (row reduction) to the augmented 
matrix [A |b] to obtain [H |c], where H is in reduced row echelon form. 


e 


Read the system’s solution: xı — z3/6 = 0 and z2 — z3 = 0. 


Balancing the chemical equation means finding the smallest positive whole
numbers $x_1$, $x_2$, and $x_3$ solving the system. Since there are more unknowns
than equations, there is an infinite number of solutions. Note that $x_3$ is part
of every equation; it is a free variable, so $x_3$ can equal anything, and each
choice determines $x_1$ and $x_2$. The fraction 1/6 is a coefficient of $x_3$ in one of
the equations, so choose $x_3$ to be the smallest whole number that eliminates the
fraction; i.e., choose $x_3 = 6$. Then $x_1 = 1$ and $x_2 = 6$.

The balanced reaction equation is 


\[ \mathrm{S_6} + 6\,\mathrm{O_2} \longrightarrow 6\,\mathrm{SO_2}. \]
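
The steps of this example can also be carried out in a few Maple commands. What follows is a minimal sketch, not the text's own worksheet: it assumes the coefficient matrix of the homogeneous system above (rows for S and O), and the names A, v, d, and w are chosen here for illustration. It uses NullSpace from the LinearAlgebra package in place of reading the solution off the reduced matrix.

```maple
with(LinearAlgebra):
# Coefficient matrix of the homogeneous system: 6*x1 - x3 = 0, 2*x2 - 2*x3 = 0
A := Matrix([[6, 0, -1], [0, 2, -2]]):
# NullSpace returns a set of basis vectors; this system has a single basis vector
v := NullSpace(A)[1]:
# Clear fractions with the least common multiple of the denominators
d := ilcm(seq(denom(v[i]), i = 1 .. 3)):
w := d*v:
# Divide out any common factor so the integers are as small as possible
w := w/igcd(seq(w[i], i = 1 .. 3));
```

The entries of w should reproduce the multipliers (1, 6, 6) found above, possibly with every sign flipped, depending on the basis vector NullSpace returns.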


We are presented with the following more complicated unbalanced equa- 
tion that would be difficult to solve by inspection. 


Example 5.8. Ethylenediamine Mixed with Dinitrogen Tetroxide. 
Balance the reaction 


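\[ \mathrm{C_2H_8N_2} + \mathrm{N_2O_4} \longrightarrow \mathrm{N_2} + \mathrm{CO_2} + \mathrm{H_2O} \]

STEP 1. Introduce five multipliers $x_1$ through $x_5$, one for each of the 5 terms.

\[ x_1\,\mathrm{C_2H_8N_2} + x_2\,\mathrm{N_2O_4} \longrightarrow x_3\,\mathrm{N_2} + x_4\,\mathrm{CO_2} + x_5\,\mathrm{H_2O} \]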


STEP 2. List the equation for each element. 
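\[ \begin{aligned} \text{C:}\quad & 2x_1 = x_4 \\ \text{H:}\quad & 8x_1 = 2x_5 \\ \text{N:}\quad & 2x_1 + 2x_2 = 2x_3 \\ \text{O:}\quad & 4x_2 = 2x_4 + x_5 \end{aligned} \]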


STEP 3. Write the augmented matrix for the homogeneous system. 


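\[ [A \mid b] = \left[\begin{array}{rrrrr|r} 2 & 0 & 0 & -1 & 0 & 0 \\ 8 & 0 & 0 & 0 & -2 & 0 \\ 2 & 2 & -2 & 0 & 0 & 0 \\ 0 & 4 & 0 & -2 & -1 & 0 \end{array}\right] \]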




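STEP 4. Maple's ReducedRowEchelonForm yields

> B := <<2, 8, 2, 0> | <0, 0, 2, 4> | <0, 0, -2, 0> | <-1, 0, 0, -2> | <0, -2, 0, -1> | <0, 0, 0, 0>>:
> LinearAlgebra:-ReducedRowEchelonForm(B);

\[ \left[\begin{array}{rrrrr|r} 1 & 0 & 0 & 0 & -1/4 & 0 \\ 0 & 1 & 0 & 0 & -1/2 & 0 \\ 0 & 0 & 1 & 0 & -3/4 & 0 \\ 0 & 0 & 0 & 1 & -1/2 & 0 \end{array}\right] \]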


STEP 5. The least common multiple of the denominators for $x_5$'s coefficients
is 4; choose $x_5 = 4$. Then $x_1 = 1$, $x_2 = 2$, $x_3 = 3$, and $x_4 = 2$.








The balanced equation is 


\[ \mathrm{C_2H_8N_2} + 2\,\mathrm{N_2O_4} \longrightarrow 3\,\mathrm{N_2} + 2\,\mathrm{CO_2} + 4\,\mathrm{H_2O} \]





Check the result—count the elements in the equations to make sure each 
balances. 
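
The check can be delegated to Maple as well. The sketch below is an illustration rather than part of the original worksheet: it assumes the coefficient matrix from STEP 3 (rows for C, H, N, and O) and the multipliers found in STEP 5, and the names A and x are chosen here.

```maple
with(LinearAlgebra):
# Rows: C, H, N, O.  Columns: x1 .. x5 (products enter with negative signs)
A := Matrix([[2, 0, 0, -1, 0],
             [8, 0, 0, 0, -2],
             [2, 2, -2, 0, 0],
             [0, 4, 0, -2, -1]]):
x := Vector([1, 2, 3, 2, 4]):
# Each entry is (atoms on the reactant side) - (atoms on the product side)
A . x;
```

A zero vector confirms that every element balances.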





Exercises 
In Exercises 1 to 4 balance the basic chemical reactions. 
1. Copper plus silver nitrate displacement/redox reaction:
\[ \mathrm{Cu} + \mathrm{AgNO_3} \longrightarrow \mathrm{Ag} + \mathrm{Cu(NO_3)_2} \]
(Elements: copper (Cu), silver (Ag), nitrogen (N), and oxygen (O).)
2. Zinc and hydrochloric acid replacement reaction:
\[ \mathrm{Zn} + \mathrm{HCl} \longrightarrow \mathrm{ZnCl_2} + \mathrm{H_2} \]
(Elements: zinc (Zn), hydrogen (H), and chlorine (Cl).)
3. Ferrous oxide to ferric oxide reaction:
\[ \mathrm{FeO} + \mathrm{O_2} \longrightarrow \mathrm{Fe_2O_3} \]
(Elements: iron (Fe) and oxygen (O).)
4. Calcium hydroxide and phosphoric acid neutralization reaction:
\[ \mathrm{Ca(OH)_2} + \mathrm{H_3PO_4} \longrightarrow \mathrm{Ca_3(PO_4)_2} + \mathrm{H_2O} \]


(Elements: calcium (Ca), oxygen (O), hydrogen (H), and phosphorus (P).) 
Projects 


Project 1. Explain why we need to use the smallest integer solution for the 
balanced chemical reactions. Provide examples. 


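Project 2. The reduction of kerosene takes place in three steps:

\[ \begin{aligned} \mathrm{C_{10}H_{16}} + \mathrm{O_2} &\longrightarrow \mathrm{CO} + \mathrm{H_2} \\ \mathrm{CO} + \mathrm{O_2} &\longrightarrow \mathrm{CO_2} \\ \mathrm{H_2} + \mathrm{O_2} &\longrightarrow \mathrm{H_2O} \end{aligned} \]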





Balance the three reactions simultaneously. 
Review of Regression Models and Advanced Regression Models











Objectives: 

(1) Understand the concepts of correlation and linearity. 
(2) Build and interpret nonlinear regression models. 

(3) Build and interpret logistic regression models. 

(4) Build and interpret Poisson regression models. 

(5) Understand when to use each type of regression model. 


(6) Understand the use of technology in regression. 


The Philippine National Statistics Coordination Board (NSCB) has collected 
data on acts of violence committed by terrorists and insurgents over the past 
decade. Over the same period, the NSCB has collected data on the population 
such as education levels (literacy), employment, government satisfaction, eth- 
nicity, and so forth. The government is looking to see which areas it can target 
for improvements that might reduce the number of violent acts committed. 
What should the administration do to improve the situation? 

In this chapter, we will discuss several regression techniques as background 
information and keys to finding adequate models. We do not try to compre- 
hensively cover regression, but we will use real examples to illustrate the 
techniques used to gain insights, predict, explain, and answer scenario-related 
questions such as the Philippines question above. 

The forms we will study are simple linear regression, multiple regression,
exponential and sine nonlinear regression, binary logistic regression, and
Poisson regression.¹





1 This chapter is adapted from Fox and Hammond, “Advanced Regression Models: Least 
Squares, Nonlinear, Poisson and Binary Logistics Regression Using R” in Marquez and Lev, 
Data Science and Digital Business, Springer, 2019. 
6.1 Re-Introduction to Regression


Simple linear regression² builds the model $f(x) = ax + b$ by minimizing the
sum of errors squared. As an optimization problem, this is

\[ \text{Minimize } SSE = \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2 \tag{6.1} \]


Most programs will do the computations for (6.1) easily; a calculator can 
handle small data sets. Excel, JMP, R, SAS, Stata, and SPSS are among the 
most commonly used programs. We will continue to use Maple. 
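
For the straight-line model, the minimization in (6.1) has a closed-form solution. The derivation below is a standard calculus sketch added for reference, with every sum running over $i = 1, \dots, n$; it is not reproduced from the text.

```latex
% Set both partial derivatives of SSE = \sum (y_i - a x_i - b)^2 to zero
\frac{\partial\,SSE}{\partial a} = -2\sum x_i\,(y_i - a x_i - b) = 0,
\qquad
\frac{\partial\,SSE}{\partial b} = -2\sum (y_i - a x_i - b) = 0.

% Solving the two normal equations for the slope a and intercept b
a = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \bigl(\sum x_i\bigr)^2},
\qquad
b = \frac{\sum y_i - a\sum x_i}{n}.
```

These are the quantities a calculator or statistics program evaluates when it reports a least squares line.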

For the ordinary least squares method of linear regression, we are most 
concerned, in this chapter, with the mathematical modeling and the use of the 
model for explaining or predicting the phenomena being studied or analyzed. 
Other standard uses are variable screening and parameter estimation. We will 
make use of the basic diagnostic measures, percent relative error and residual 
plots, for analysis. Percent relative error is calculated as

\[ \%RE = \frac{y_i - f(x_i)}{y_i} \cdot 100. \tag{6.2} \]

A rule of thumb for the magnitude of percent relative error is that most should 
be less than 20%, and those near where a prediction is needed should be less 
than 10%. 

A plot of residuals versus the model provides a visualization of the goodness 
of a model’s fit. Examine the plot for trends or patterns such as those shown 
in Figure 6.1. If a pattern is apparent, the model is not adequate, even though 
we might have to use it. If there is no visual pattern, then we may take that 
as evidence that the model is adequate. 

When should we use simple linear regression and when should we use some- 
thing more advanced? Let’s begin an answer with the concept of correlation. 


Correlation 


Many decision makers have misconceptions about correlation in linear
regression that are often engendered by non-technical, common-usage definitions
of the term. The Oxford English Dictionary defines


correlation: noun
1. A mutual relationship or connection between two or more things.





² Linear regression was introduced in Chapter 5, Model Fitting and Linear Regression,
of Volume 1.

³ “Least squares” was introduced by Gauss in the early 1800s; Legendre was the first to
publish the technique in 1805. Galton introduced the term “regression” in his classic 1885
study of children’s size relating to their parents’ size.
FIGURE 6.1: Patterns in the Residual Plot. Panels: (b) Curved Pattern, (c) Fanning Pattern, (d) An Outlier, (e) Linear Pattern.


A statistical definition in the context of linear regression is given in terms of
Pearson’s product-moment correlation coefficient, often reduced to correlation
coefficient, or even just correlation:


correlation is a measure of the strength of the linear relationship
between two variables.⁴


⁴ See, e.g., Lane et al., Introduction to Statistics, Online ed., p. 170.

Linear relationship is a key term in the definition. Some definitions for
correlation merely state it is a measure of the relationship between two
variables and do not mention linearity. Definitions like the one from the help
menu of a popular spreadsheet


CORREL: Returns the correlation coefficient of the Array1 and Array2
cell ranges. Use the correlation coefficient to determine the relationship
between two properties. For example, you can examine the relationship
between a location’s average temperature and the use of air conditioners.


help fuel misconceptions. 

It is no wonder decision makers have misinterpretations, thinking that 
correlation completely measures the relationship between variables. The most 
common misunderstandings that decision makers have expressed include: 


• Correlation implies causation.

• Correlation measures everything.

• When the correlation value is large and it looks linear (visually), then the
relation must be linear.

• Model diagnostics are not needed when the correlation value is large.

• The regression package that was taught in class is what must be used.


The next section begins with examples that illustrate some of the miscon- 
ceptions surrounding correlation and shows possible corrections that we use 
in our mathematical modeling courses. One diagnostic test that we now cover 
is the “common-sense test”—does the model answer the question and does it 
provide realistic results? 





6.2 Modeling, Correlation, and Regression 


Return to the Philippines scenario introduced at the beginning of the chapter. 
The researcher actually began with linear regression. For example, in 2008, 
an analysis of the literacy index per region versus the number of significant 
acts of violence (by terrorists or insurgents) produced the linear model 


\[ f(x) = -1.5463\,x + 146.54 \]


with correlation coefficient $-0.3742$ or $R^2 = 0.14$ ($R^2$ is the square of the
correlation coefficient). The model is plotted in Figure 6.2.

FIGURE 6.2: Philippines: Literacy v. Significant Acts of Violence, 2008


Within all the data, Figure 6.2 represents one of the “best” resulting overall
models. Note that $R^2$ is only 0.14, which represents a really poor linear fit
or a very weak linear relationship. We will return to this scenario again later
in the chapter, but suffice it to say, linear regression is not useful to explain,
predict, or answer the questions the Philippine government had posed.








6.3 Linear, Nonlinear, and Multiple Regression 


Many applications are not linear. For example, a pendulum is modeled by
the differential equation $\theta'' + \omega^2 \sin(\theta) = 0$; this equation can be “linearized”
by replacing $\sin(\theta)$ with $\theta$, a good linear approximation for small values of $\theta$.
Two widely used rules of thumb for relating correlation to the likelihood of a 
linear relationship come from Devore [D2012] for math, science, and engineer- 
ing data, shown in Table 6.1, and a more liberal interpretation from John- 
son [J2012] for non-math, non-science, and non-engineering data, given in 
Table 6.2. 
TABLE 6.1: Linearity Likelihood for Math, Science, and Engineering Data;
Devore

Correlation ρ        Relationship
0.8 < |ρ| < 1.0      Strong linear relationship
0.5 < |ρ| < 0.8      Moderate linear relationship
0.0 < |ρ| < 0.5      Weak linear relationship


TABLE 6.2: Linearity Likelihood for Non-Math, Non-Science, and
Non-Engineering Data; Johnson

Correlation ρ        Relationship
0.5 < |ρ| < 1.0      Strong linear relationship
0.3 < |ρ| < 0.5      Moderate linear relationship
0.1 < |ρ| < 0.3      Weak linear relationship
0.0 < |ρ| < 0.1      No linear relationship


In our modeling efforts, we emphasize the interpretation of ρ ≈ 0. A very
small correlation coefficient can be interpreted as indicating either no linear
relationship or the existence of a nonlinear relationship. Many neophyte
analysts fail to pick up on the importance of potential nonlinear relationships
in their interpretation.
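
A tiny experiment makes the point concrete. The data below are invented for illustration (they are not from the text): y is exactly the square of x, a perfect nonlinear relationship, yet the linear correlation vanishes because the x values are symmetric about zero.

```maple
with(Statistics):
# Hypothetical data: x runs over -3..3 and y = x^2 exactly
X := [seq(i, i = -3 .. 3)]:
Y := map(v -> v^2, X):
# Pearson correlation measures only the linear part of the relationship
Correlation(X, Y);
```

The reported correlation is zero even though y is completely determined by x.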

Let’s look at an example where we attempt to fit a linear model to data 
that may not be linear. 


Example 6.1. Exponential Decay. 

Build a mathematical model to predict the degree of recovery after discharge 
for orthopedic surgical patients. There are two variables: time t in days in 
the hospital, and a medical prognostic index for recovery y with larger values 
indicating a better prognosis. The data in Table 6.3 comes from Neter et al. 
[NKNW1996, p. 469]. 


TABLE 6.3: Hospital Stay vs. Recovery Index Data 


t| 2 5 7 10 14 19 26 31 34 38 45 52 53 60 65 
y|54 50 45 37 35 25 20 16 18 13 8 11 8 4 6 
First, a scatterplot of the data.


> with(plots):
> t := [2, 5, 7, 10, 14, 19, 26, 31, 34, 38, 45, 52, 53, 60, 65]:
  y := [54, 50, 45, 37, 35, 25, 20, 16, 18, 13, 8, 11, 8, 4, 6]:
  Pts := zip((x, y) -> [x, y], t, y):
> pointplot(Pts, labels = [days, `Recovery index`], labeldirections =
    [horizontal, vertical], title = "Hospital Stay vs. Recovery Index");

The scatterplot shows a clear negative trend.
Check the correlation ρ.


> rho := Statistics:-Correlation(t, y);
  'R^2' = rho^2;
        ρ := -0.941052825413336
        R^2 = 0.885580420218423


Using either rule of thumb, the correlation coefficient ρ = -0.94 indicates a
strong linear relation. With this value in hand, looking at the scatterplot leads
us to think we will have an excellent linear regression model.

Let’s determine the linear model, calling it RI.





> Statistics:-Fit(a*x + b, t, y, x);
  RI := unapply(evalf(%, 6), x);
        RI := x ↦ -0.752508 x + 46.4604
It’s time for diagnostics. Calculate the residuals and percent relative errors, 
then plot the residuals. 





> Residuals := [seq(y[i] - RI(t[i]), i = 1..15)]:
> RelErr := [seq(Residuals[i]/y[i]*100, i = 1..15)]:

Look at the worst percent relative error values.





> SRE := sort(RelErr):
  SRE[1..2], SRE[-2..-1];
        [-57.46925000, -44.57907500], [67.25200000, 140.8770000]
> ResidualPts := zip((a, b) -> [a, b], t, Residuals):   # pair each time with its residual
  pointplot(ResidualPts, symbol = soliddiamond, symbolsize = 16,
    title = "Residuals");

We’ve obtained the model $y = f(t) = 46.4604 - 0.752508\,t$. The sum of squared
error is 451.1945, the correlation is -0.94, and $R^2$, the coefficient of
determination, which is the correlation coefficient squared, is 0.886. These are all
indicators of a “good” linear model.

But... 
Although some percent relative errors are small, others are quite large, with 
eight of the fifteen over 20%. The largest two positive errors are ≈ 67% and
140%, and the largest two negative errors are ≈ -45% and -57%. How much
confidence would you have in making predictions with this model? The resid- 
ual plot clearly shows a curved pattern. The residuals and percent relative 
errors show that we do not have an adequate model. Advanced courses in 
statistical regression will show how to attempt to correct this inadequacy. 
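
The count of large percent relative errors is easy to automate. The following is a self-contained sketch under the assumption that the data of Table 6.3 and the linear fit are as given above; the 20% threshold is the rule of thumb from Section 6.1, and the name RelErr mirrors the worksheet.

```maple
with(Statistics):
t := [2, 5, 7, 10, 14, 19, 26, 31, 34, 38, 45, 52, 53, 60, 65]:
y := [54, 50, 45, 37, 35, 25, 20, 16, 18, 13, 8, 11, 8, 4, 6]:
# Least squares line, as in the example
RI := unapply(Fit(a*x + b, t, y, x), x):
# Percent relative error (6.2) for each data point
RelErr := [seq(100*(y[i] - RI(t[i]))/y[i], i = 1 .. 15)]:
# Number of points whose error exceeds 20% in magnitude
nops(select(r -> abs(r) > 20, RelErr));
```

The result should agree with the count of eight quoted above.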

Further, suppose we need to predict the index when time was 100 days. 


> RI(100);
        -28.790400


A negative value is clearly unacceptable and makes no sense in the context of 
our problem since the index is always positive. This model does not pass the 
common sense test. 

With a strong correlation of —0.94, what went wrong? The residual plot 
diagnostic shows a curved pattern. In many regression analysis books, the 
suggested first-attempt cure for a curved residual is adding a nonlinear term
that is missing from the model. Since we only have (t, y) data points, we’ll
add a nonlinear term $a_2 x^2$ to the model.


Example 6.2. Multiple Linear Regression.
Fit a parabolic model $f(x) = a_0 + a_1 x + a_2 x^2$ to the Hospital Stay vs. Recovery
Index Data of Table 6.3.


This model, although a parabola, is linear in the “variables” $a_0$, $a_1$, and
$a_2$.
Let’s find the fit with Maple. 


> with(Statistics):
> Fit(a0 + a1*x + a2*x^2, t, y, x):
  QF := unapply(evalf(%, 4), x);
        QF := x ↦ 55.82 - 1.710 x + 0.01481 x^2


Use the formula $R^2 = 1 - SSE/SST$ to compute $R^2$, the coefficient of
determination, for the quadratic fit.


> SSE := add((y[i] - QF(t[i]))^2, i = 1..15);
  mu := Mean(y):
  SST := add((y[i] - mu)^2, i = 1..15);
        SSE := 72.34557123
        SST := 3943.333333
> RSq := 1 - SSE/SST;
        RSq := 0.9816537013


These diagnostics look quite good. The sum of the squares of the errors is much 
smaller. The coefficient of determination R? is larger, and the correlation has 
increased to 0.99. 

Let’s plot the new residuals to see if there is still a curved pattern. 


> QResidualPts := [seq([t[i], y[i] - QF(t[i])], i = 1..15)]:
> pointplot(QResidualPts, symbol = soliddiamond, symbolsize = 16,
    title = "Quadratic Fit's Residuals");

Use the model to predict the index at 100 days. We find QF(100) = 32.92.
The answer is now positive, but again does not pass the common-sense test. 
The quadratic function is now curving upwards toward positive infinity. Graph 
it! An upward curve is an unacceptable outcome; an increasing curve for the 
index indicates that staying in the hospital for an extended time leads to a 
better prognosis. We certainly cannot use the model to predict the values of 
the index for t beyond 60. 
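
To see where the parabola turns upward, locate its vertex. This is a small sketch that assumes the rounded coefficients of QF reported above; the name tmin is introduced here for illustration.

```maple
# Quadratic fit with the rounded coefficients from the example
QF := x -> 55.82 - 1.710*x + 0.01481*x^2:
# The parabola stops decreasing where its derivative vanishes
tmin := solve(diff(QF(x), x) = 0, x);
# Value of the index at the turning point
QF(tmin);
# Graph the model well beyond the data to see the upturn
plot(QF(x), x = 0 .. 100, labels = [days, `Recovery index`]);
```

With these coefficients the turning point falls near day 58, just inside the range of the data, which is why predictions beyond about 60 days cannot be trusted.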

We must try a different model. Looking at the original scatterplot leads 
us to attempt fitting an exponential decay curve. 


Example 6.3. Nonlinear Regression: Exponential Decay.
Fit an exponential decay model $f(x) = a_0 e^{a_1 x}$ to the Hospital Stay vs. Recovery
Index Data of Table 6.3.⁵


Maple’s NonlinearFit command from the Statistics package does not do 
well unless we specify reasonable estimates for the parameters of our model, 
here ag and a1. 


> NonlinearFit(a0*exp(a1*x), t, y, x):
  NEDF := unapply(evalf(%, 5), x);
NEDF := x ++ 5.6790 10779 e*-0782# 


Test the fit at t = 100. 
> NEDF(100);
        2.561123364 × 10^16





5 Adapted from Neter et al. [NKNW1996] and Fox [Fox2012]. 
A ridiculous value!

Care must be taken with the selection of the initial values for the unknown 
parameters (see, e.g., Fox [Fox2012]). Maple yields good models when the 
calculations are based upon good input parameters. How can we get reasonable 
approximations to $a_0$ and $a_1$? Let’s use a standard technique: transform the
model to make it linear. Apply logs to our model.

\[ y = a_0 e^{a_1 x} \]

\[ \begin{aligned} \ln(y) &= \ln\bigl(a_0 e^{a_1 x}\bigr) \\ &= \ln(a_0) + a_1 x \end{aligned} \]

Relabel $\ln(y) = Y$ and $\ln(a_0) = A_0$ to obtain

\[ Y = A_0 + a_1 x \]


Now use Maple to fit the transformed model. 


> LinearFit(A0 + a1*x, t, ln~(y), x);
        4.03715886613379 - 0.0379741808112946 x

Good estimates to use for the parameters are

\[ a_0 = e^{A_0} = 56.667 \quad\text{and}\quad a_1 = -0.03797. \]


> NonlinearFit(a0*exp(a1*x), t, y, x,
    initialvalues = [a0 = 56.667, a1 = -0.03797]):
  NEDF := unapply(evalf(%, 5), x);
  NEDF(100);
        NEDF := x ↦ 58.607 e^(-0.039586 x)
        1.118797159


This prediction is much closer to what we expect. A common technique used 
to test the result is to recompute the model with different, but close, initial 
values. Try it! 

Note how close the nonlinear-fitted parameter values are to the estimates. 
The difference comes from where least squares is applied in the computation: 
to the logs of data points, rather than to the points. Write the general least 
squares formulas for the model and the transformed model to see the differ- 
ence. Also, compare the model from Maple’s ExponentialFit to the one we 
have found. 
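
For the suggested comparison, a minimal sketch follows. It assumes, as the text implies, that ExponentialFit in the Statistics package fits a model of the form a·e^(bx) by least squares on the logarithms of the data, so its output should be close to the transformed-model estimates rather than to the NonlinearFit result.

```maple
with(Statistics):
t := [2, 5, 7, 10, 14, 19, 26, 31, 34, 38, 45, 52, 53, 60, 65]:
y := [54, 50, 45, 37, 35, 25, 20, 16, 18, 13, 8, 11, 8, 4, 6]:
# Exponential fit via the log transform; compare with a0 = 56.667, a1 = -0.03797
ExponentialFit(t, y, x);
```

Comparing this model with NEDF shows how much the choice of where least squares is applied (to ln(y) or to y) moves the parameters.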

For a nonlinear model, linear correlation and $R^2$ have no meaning.
Computing the sum of squared error gives SSE = 45.495. This value is
substantially smaller than SSE = 451.1945 obtained by the linear model. The
nonlinear model appears reasonable. Now check the nonlinear model’s residual
plot for patterns.








> EDResidualPts := [seq([t[i], y[i] - NEDF(t[i])], i = 1..15)]:
> pointplot(EDResidualPts, symbol = soliddiamond, symbolsize = 16,
    title = "Exponential Decay Fit's Residuals");


Figure 6.3 shows the exponential decay model overlaid on the data.

FIGURE 6.3: Exponential Decay Fit to Hospital Recovery Data


The percent relative errors are also much improved—only four are larger 
than 20%, and none are beyond 36.3%. Verify this! 
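
One way to carry out the verification is sketched below. It assumes the rounded model NEDF reported above, so the results may differ slightly in the last digits from values computed with the unrounded fit.

```maple
t := [2, 5, 7, 10, 14, 19, 26, 31, 34, 38, 45, 52, 53, 60, 65]:
y := [54, 50, 45, 37, 35, 25, 20, 16, 18, 13, 8, 11, 8, 4, 6]:
# Exponential decay model with the rounded fitted parameters
NEDF := x -> 58.607*exp(-0.039586*x):
# Percent relative errors, formula (6.2)
RelErr := [seq(100*(y[i] - NEDF(t[i]))/y[i], i = 1 .. 15)]:
# How many exceed 20% in magnitude, and how large is the worst one?
nops(select(r -> abs(r) > 20, RelErr));
max(op(map(abs, RelErr)));
# Sum of squared error, for comparison with the value quoted in the text
add((y[i] - NEDF(t[i]))^2, i = 1 .. 15);
```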
Using this model to predict the index when t = 100 gave y = 1.12. This
new result passes the common-sense test. This model would be our recom- 
mendation for the hospital’s data. 

The text’s PSMv2 library contains NonlinearRegression, a program we 
wrote for nonlinear regression in Maple that prints a diagnostic report, and 
then returns a fit of the specified model. 


> with(PSMv2):
> Describe(NonlinearRegression);

  # Usage: NonlinearRegression(X_data, Y_data, model, param_estimates)
  NonlinearRegression( )

> model := a0*exp(a1*x);
  paramestimates := [a0 = 50, a1 = -0.04];
  NonlinearRegression(t, y, model, paramestimates)

        Coefficient    Standard Error    T-Statistic    P-value
  a0    58.607         1.4722            39.810         0.0
  a1    -0.039586      0.0017113         -23.132        0.0

        Model m(x) = 58.607 e^(-0.039586 x)


We see that both coefficients are statistically significant (P-values < 0.05). 
Just for illustration, change the model to a+ b - exp(c- x) and rerun 
NonlinearRegression. 





> model2 := a + b*exp(c*x);
  paramestimates2 := [a = 2.5, b = 57, c = -0.04];
  NonlinearRegression(t, y, model2, paramestimates2)

        Coefficient    Standard Error    T-Statistic    P-value
  a     2.4302         1.9655            1.2364         0.23995
  b     57.332         1.8284            31.356         0.0
  c     -0.044604      0.0048777         -9.1445        0.0

        Model m(x) = 2.4302 + 57.332 e^(-0.044604 x)

Select the model with:

> rhs(%);
        2.4302 + 57.332 e^(-0.044604 x)


The new constant term a is not statistically significant (P-value > 0.05). 
Exercises 


In Exercises 1 through 4 compute the correlation coefficient and then fit the 
following data with the given models using linear regression. 


EEEE 
y[1 122 4 





(a) $y = a + bx$
(b) $y = ax^2$
2. Data from stretching a spring: 
x (×10^-3) | 5 10 20 30 40 50 60 70 80 90 100
y (×10^5)  | 0 19 57 94 134 173 216 256 297 343 390

(a) $y = ax$
(b) $y = a + bx$
(c) $y = ax^2$


3. Data for the ponderosa pine: 
x | 17 19 20 22 23 25 28 31 32 33 36 37 39 42
y | 19 25 32 51 57 71 113 140 153 187 192 205 250 260

(a) $y = ax$
(b) $y = a + bx$
(c) $y = ax^2$

(Dya 
) 


4. Fit the model $y = a x^{3/2}$ to Kepler’s planetary data:

Body    | Period (s)   | Distance from sun (m)
Mercury | 7.60 × 10^6  | 5.79 × 10^10
Venus   | 1.94 × 10^7  | 1.08 × 10^11
Earth   | 3.16 × 10^7  | 1.5 × 10^11
Mars    | 5.94 × 10^7  | 2.28 × 10^11
Jupiter | 3.74 × 10^8  | 7.79 × 10^11
Saturn  | 9.35 × 10^8  | 1.43 × 10^12
Uranus  | 2.64 × 10^9  | 2.87 × 10^12
Neptune | 5.22 × 10^9  | 4.5 × 10^12
5. Fit the following data, Experience vs. Flight time, with a nonlinear
exponential model $y = a e^{bx}$.








Crew Total Crew Total 
Experience Flight Experience Flight 
(months) Time (months) Time 
91.5 23.79812835 116.1 33.85966964 
84 22.32110752 100.6 25.21871384 
76.5 18.59206246 85 25.63021418 
69 15.06492527 69.4 18.52772164 
61.5 14.9960869 53.9 12.88023798 
80 24.63492506 112.3 31.42422451 
72.5 18.87937085 96.7 26.299381 
65 18.60416326 81.1 19.6670982 
57.5 13.6173644 65.6 15.21024716 
50 15.62588379 50 11.977759 
103 28.46073261 120 36.33722835 
95.5 30.39886433 104.4 29.35965197 
88 25.06898139 88.9 21.78073392 
80.5 21.30460092 73.7 21.72030963 
73 21.98724383 57.8 16.19643014 








Projects 
Project 6.1. Write a program without using statistical programs/software to

(a) compute the correlation between the data, and

(b) find the least squares fit for a general proportionality model: y is
proportional to $x^n$.


Project 6.2. Write a program without using statistical programs/software to

(a) compute the correlation between the data, and

(b) find the least squares fit for a general polynomial model
$y = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$.
6.4 Advanced Regression Techniques with Examples


In this section, we will consider sine regression, one-predictor logistic
regression, and one-predictor Poisson regression. First, consider data that has an
oscillating component.


Nonlinear Regression 


Example 6.4. Model Shipping by Month. 

Management is asking for a model that explains the behavior of tons of mate- 
rial shipped over time so that predictions might be made concerning future 
allocation of resources. Table 6.4 shows logistical supply train information 
collected over 20 months. 


TABLE 6.4: Total Shipping Weight vs. Month 


Month Shipped (tons) Month Shipped (tons) 
1 20 11 19 
2 15 12 25 
3 10 13 32 
4 18 14 26 
5 28 15 21 
6 18 16 29 
7 13 17 35 
8 21 18 28 
9 28 19 22 
10 22 20 32 


First, we find the correlation coefficient. 
> with(plots):
  with(Statistics):
  with(PSMv2):
> Month := [$1..20]:
  Tons := [20, 15, 10, 18, 28, 18, 13, 21, 28, 22, 19, 25, 32, 26, 21, 29, 35, 28, 22, 32]:
> Correlation(Month, Tons);
        0.672564359308119
According to our rules of thumb, 0.67 is a moderate to strong value for linear
correlation. So is the model to use linear? Plot the data, looking for trends and
patterns. Figure 6.4a shows the data as a scatterplot, while 6.4b “connects the
dots.”

FIGURE 6.4: Shipping Data Graphs


Although linear regression can be used here, it will not capture the seasonal 
trends. There appears to be an oscillating pattern with a linear upward trend. 
For purposes of comparison, find a linear model. 
> Fit(a + b*x, Month, Tons, x, summarize = embed);

        15.1263157894737 + 0.759398496240602 x

Summary
Model: 15.126316 + 0.75939850 x

Coefficients | Estimate | Standard Error | t-value | P(>|t|)
a            | 15.1263  | 2.35928        | 6.41140 | 4.90702 × 10^-6
b            | 0.759398 | 0.196949       | 3.85581 | 0.00115802

R-squared: 0.452343
Adjusted R-squared: 0.421917


The $R^2$ value of 0.45 does not indicate a strong fit of the data as we
expected. Since we need to represent oscillations with a slight linear upward
trend, we’ll try a sine model with a linear component

\[ f(x) = a_0 + a_1 x + a_2 \sin(a_3 x + a_4). \]
As noted before, good estimates of the parameters $a_i$ are necessary for
obtaining a good fit. Check Maple’s fit with default parameter values! Use the linear
fit from above for estimating $a_0$ and $a_1$; use your knowledge of trigonometry
to estimate the other parameters.

\[ a_0 = 15,\quad a_1 = 0.8,\quad a_2 = 6,\quad a_3 = 1.6,\quad a_4 = 1. \]
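
One way to arrive at estimates of this size from Table 6.4 is sketched here; it is a rough reading of the data offered as an illustration, not the authors' stated procedure. The local peaks occur at months 5, 9, 13, and 17, so the period is about $T = 4$ months, and a neighboring peak and trough (32 tons in month 13, 21 tons in month 15) give a crude amplitude if the trend is ignored.

```latex
% Angular frequency from the observed period of about 4 months
a_3 \approx \frac{2\pi}{T} = \frac{2\pi}{4} \approx 1.57

% Amplitude as half of a peak-to-trough swing, ignoring the linear trend
a_2 \approx \frac{32 - 21}{2} = 5.5
```

Both values are close to the starting estimates $a_3 = 1.6$ and $a_2 = 6$ used above.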
Use NonlinearRegression from the PSMv2 package. 


> sinemodel := a0 + a1*x + a2*sin(a3*x + a4):
  paramestimates := [a0 = 15, a1 = 0.8, a2 = 6, a3 = 1.6, a4 = 1.0]:
> NonlinearRegression(Month, Tons, sinemodel, paramestimates);

        Coefficient    Standard Error    T-Statistic    P-value
  a0    14.187         0.56742           25.002         0.
  a1    0.84795        0.047473          17.862         0.
  a2    6.6892         0.38351           17.442         0.
  a3    1.5735         0.010024          156.98         0.
  a4    0.082625       0.12415           0.66555        0.51580

        Model: m(x) = 14.187 + 0.84795 x + 6.6892 sin(1.5735 x + 0.082625)


The coefficients’ p-values look very good, save that of the phase shift $a_4$. Plot
the model with the data.


> m := rhs(%):
> Pts := zip((a, b) -> [a, b], Month, Tons):   # data points for the overlay
  display(plot(m, x = 0..21, thickness = 2),
    pointplot(Pts, symbol = solidcircle, symbolsize = 14),
    labels = [month, `Tons Shipped`],
    labeldirections = [horizontal, vertical]);
This model captures the oscillations and upward trend nicely. The sum of
squared error is only SSE = 21.8. The new SSE is quite a bit smaller than 
that of the linear model. Clearly the model based on sine+linear regression 
does a much better job in predicting the trends than just using a simple linear 
regression. 


Example 6.5. Modeling Casualties in Afghanistan. 

In a January 2010 news report, General Barry McCaffrey, USA, Retired, stated 
that the situation in Afghanistan would be getting much worse. General 
McCaffrey claimed casualties would double over the next year. The problem 
is to analyze the data to determine whether it supports his assertion. 


The data that Gen. McCaffrey used for his analysis was the available 2001- 
2009 figures shown in Table 6.5. The table also shows casualties for 2010 and 
part of 2011 that were not available at the time. 


TABLE 6.5: Casualties in Afghanistan by Month 


Month | 2001  2002  2003  2004  2005  2006  2007  2008  2009 | 2010  2011
    1 |         12    10    25     6     7    21    19    83 |  199   308
    2 |         13     9    17     5    17    39    18    52 |  247   245
    3 |          5    14    12    16     7    26    53    78 |  346   345
    4 |          8    13    11    29    13    61    37    60 |  307   411
    5 |          2     8    31    34    39    87   117   156 |  443
    6 |          3          34    60    68   100   167   213 |  583
    7 |          6    10    25    38    59   100   151   394 |  667
    8 |          2    13    22    72    56   103   167   493 |  631
    9 |          5    19    34    47    70    88   122   390 |  674
   10 |    5     8     5    38    27    68   131    90   348 |  631
   11 |   10     4    27    18    12    51    75    37   214 |  605
   12 |   28     6    12     9    20    23    46    50   168 |  359








First, do a quick “reasonability model.” Sum the numbers across the years 
that Gen. McCaffrey had data for (Table 6.6) and graph a scatterplot. 


TABLE 6.6: Casualties in Afghanistan by Year 


Year       | 2002  2003  2004  2005  2006  2007  2008  2009
Casualties |  122   144   276   366   478   877  1028  2649








⁶ See www.newser.com/story/77563/general-brace-for-thousands-of-gi-casualties.html.




The scatterplot’s shape suggests that we use a parabola as our “reason- 
ability model.” 


> with(plots):
  with(Statistics):
> CasYr := [122, 144, 276, 366, 478, 877, 1028, 2649]:
  Yr := [$1..8]:
  Pts := zip((x, y) -> [x, y], Yr, CasYr):
> Fit(a + b*x + c*x^2, Yr, CasYr, x):
  RM := unapply(evalf(%, 5), x);
  RM(9);
        RM := x ↦ 606.39 - 404.54 x + 76.726 x^2
        3180.336


The model’s prediction, while much smaller than the actual 2010 value, is not 
a doubling. However, the model does suggest further analysis is required. 

We will focus on the four years before 2010, that is 2006 to 2009, and ask, 
do we expect the casualties in Afghanistan to double over the next year, 2010, 
based on those casualty figures? 





> Cas := [7, 17, 7, 13, 39, 68, 59, 56, 70, 68, 51, 23, 21, 39, 26, 61, 87, 100, 100,
    103, 88, 131, 75, 46, 19, 18, 53, 37, 117, 167, 151, 167, 122, 90, 37, 50, 83, 52,
    78, 60, 156, 213, 394, 493, 390, 348, 214, 168]:
  Mon := [$1..nops(Cas)]:
  Pts := zip((x, y) -> [x, y], Mon, Cas):





In the same fashion as before, plot both a scatterplot and a line plot of the 
data available to Gen. McCaffrey over that period. See Figure 6.5. The line 
plot may better show trends in the data, such as an upward tendency or 
oscillations that are not apparent in the scatterplot. However, a line plot can 
be very difficult to read or interpret when there are a large number of data 
points connected. Good graphing is always a balancing act. After modeling 
the data from 2006 to 2009, we can use the 2010 values to test our model 
for goodness of prediction. There are two trends apparent from the graphs. 
First, the data oscillates seasonally. This time, however, the oscillations grow
in magnitude. We will try to capture that with an $x \cdot \sin(x)$ term. Second,
the data appears to have an overall upward trend. We will attempt to capture
that feature with a linear component. The nonlinear model we choose is

\[ m(x) = a_0 + a_1 x + a_2\, x \sin(a_3 x + a_4). \]


Using the techniques described in the previous example, we fit the 
nonlinear model: a growing-amplitude sine plus a linear trend. We estimate 
the parameters from the scatterplot: 








\[ [a_0 = 10,\ a_1 = 2,\ a_2 = 47,\ a_3 = 0.5,\ a_4 = 0.3]. \]

FIGURE 6.5: Afghanistan Casualties Graphs


Use our NonlinearRegression program. 


[> model := a0 + a1*x + a2*x*sin(a3*x + a4) :
   paramest := [a0 = 10, a1 = 2, a2 = 47, a3 = 0.5, a4 = 0.3] :
[> NonlinearRegression(Mon, Cas, model, paramest) :
   m := unapply(rhs(%), x);








      Coefficient   Standard Error   T-Statistic   P-value
a0    -7.1709       14.589           -0.49152      0.62556
a1     4.3320        0.52772          8.2089       0.0
a2    -3.4228        0.37369         -9.1596       0.0
a3     0.50094       0.010784        46.454        0.0
a4     1.4436        0.41046          3.5171       0.0010433





        m := x -> -7.1709 + 4.3320 x - 3.4228 x sin(0.50094 x + 1.4436)


The p-values for all the parameters, except the constant term, are quite good.
Plot a graph to see that the model captures the oscillations and the linear
growth fairly well. Does the model also show the increase in amplitude?
Considering the residuals will be our next diagnostic.





[> Residuals := [seq([Mon[i], Cas[i] - m(Mon[i])], i = 1..nops(Cas))] :


Now graph the residuals looking for patterns and warning signs. 
[> pointplot(Residuals, symbol = solidcircle, symbolsize = 14);

The residual plot shows no clear pattern, suggesting that the model is
adequate, although we note that the model did not “keep up” with the change
in amplitude of the oscillations.

What does the model predict for 2010 in relation to 2009? 


[> Yr2009 := add(Cas[k], k = 37..48);
   Yr2010 := add(m(k), k = 49..60);
   Yr2010/Yr2009;
                    Yr2009 := 2649
                    Yr2010 := 2869.751675
                    1.083333966


This model does not show a doubling effect from year four. Thus, the model 
does not support General McCaffrey’s hypothesis. 
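
To make the comparison concrete, the arithmetic behind this conclusion can be written out from the two totals computed above. A doubling would correspond to a ratio of 2, which the model's prediction does not approach:

\[ \frac{Yr2010}{Yr2009} = \frac{2869.75}{2649} \approx 1.083, \qquad 2 \times 2649 = 5298. \]

That is, the model predicts an increase of roughly 8.3% over 2009, whereas a doubling would require about 5298 casualties in 2010.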

Consider the ratios of casualties for each month of 2009 to 2008 and then 
2010 to 2009. How would this information affect your conclusions? 





Logistic Regression and Poisson Regression 


Often the dependent variable has special characteristics. Here we examine two 
notable cases: (a) logistic regression, also known as a logit model, where the 
dependent variable is binary, and (b) Poisson regression where the dependent 
variable measures integer counts that follow a Poisson distribution. 
One-Predictor Logistic Regression


We begin with three one-predictor logistic regression model examples in which
the dependent variable is binary, i.e., {0, 1}. The logistic regression model form
that we will use is

\[ f(x) = \frac{e^{b_0 + b_1 x}}{1 + e^{b_0 + b_1 x}}. \]

The logistic function, approximating a unit step function, gave the name
logistic regression. The most general form handles dependent variables with a
finite number of states.


Example 6.6. Damages versus Flight Time. 
After a number of hours of flight time, equipment is either damaged or not. 
Let the dependent variable y be a binary variable with 


\[ y = \begin{cases} 1 & \text{there is damage} \\ 0 & \text{there is no damage} \end{cases}, \]

and let $t$ be the flight time in hours.
Over a reporting period, the data of Table 6.7 has been collected. 


TABLE 6.7: Damage vs. Flight Time 


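t | 4  2  4  3  9  6  2  11  6  7  3  2  5  3  3  8
y | 1  1  0  1  0  0  0   0  1  0  1  1  0  0  0  0

t | 10  5  13  7  3  4  2  3  2  5  6  6  3  4  10
y |  0  1   0  0  1  0  1  1  0  0  0  1  0  1   0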
Calculate a logistic regression for damage.
[> with(Statistics) :

[> t := [4, 2, 4, 3, 9, 6, 2, 11, 6, 7, 3, 2, 5, 3, 3, 8,
         10, 5, 13, 7, 3, 4, 2, 3, 2, 5, 6, 6, 3, 4, 10] :
   y := [1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0,
         0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0] :
   Pts := zip((u, v) -> [u, v], t, y) :
[> LM := exp(a + b*x)/(1 + exp(a + b*x)) :
Now, the fit. 


[> NonlinearFit(LM, t, y, x, initialvalues = [a = 1.5, b = -0.5]) :
   Damage := unapply(evalf(%, 5), x);

        Damage := x -> e^(-0.39190 x + 1.4432) / (1. + e^(-0.39190 x + 1.4432))
[> plot(Damage, 0..1 + max(t), thickness = 2);

[Plot of the fitted logistic curve Damage for 0 ≤ x ≤ 1 + max(t).]











The analyst must decide over what intervals of x we call the y probability a 
1 or a 0 using the logistic S-curve shown from the fit. 
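
As an illustration of one such decision rule, suppose (as an assumption, since the text leaves the choice to the analyst) that we call the outcome a 1 whenever the fitted probability is at least 0.5. For the logistic form $e^{a+bx}/(1+e^{a+bx})$ the probability equals 0.5 exactly when $a + bx = 0$, so the cutoff $x^*$ for the fit of Example 6.6 is

\[ x^* = -\frac{a}{b} = -\frac{1.4432}{-0.39190} \approx 3.68. \]

Because the fitted $b$ is negative, the fitted probability of damage is above 0.5 for flight times below about 3.68 hours and below 0.5 for longer flight times. A different cutoff probability would move $x^*$ accordingly.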
We switch from times to time differentials in the next example. 


Example 6.7. Damages vs. Time Differentials. 
Replace the times in the previous example with time differentials given in 
Table 6.8. 


TABLE 6.8: Damage vs. Time Differentials (TD)


TD | 19.2  24.1  -7.1   3.9   4.5  10.6   -3   16.2
y  |  1     1     0     1     0     0      0    0

TD | 72.8  28.7  11.5  56.3  -0.5  -1.3  12.9  34.1
y  |  1     0     0     1     0     0     1     1

TD |  6.6  -2.5  24.2   2.3  36.9  -11.7  2.1  10.4
y  |  0     0     0     0     1     0     1     1

TD |  9.1   2    12.6  18     1.5  27.3  -8.4
y  |  0     0     0     1     0     1     0
Repeat the procedure of the previous example.


[> TD := [19.2, 24.1, -7.1, 3.9, 4.5, 10.6, -3, 16.2, 72.8, 28.7, 11.5, 56.3, -0.5,
          -1.3, 12.9, 34.1, 6.6, -2.5, 24.2, 2.3, 36.9, -11.7, 2.1, 10.4, 9.1, 2, 12.6, 18,
          1.5, 27.3, -8.4] :
[> NonlinearFit(LM, TD, y, x, initialvalues = [a = 0.5, b = 1]) :
   TDDamage := unapply(evalf(%, 5), x);

        TDDamage := x -> e^(0.054553 x - 1.1997) / (1. + e^(0.054553 x - 1.1997))

[> plot(TDDamage, min(TD) - 1 .. 1 + max(TD), thickness = 2);














Once again, the analyst must decide over what intervals of x we call the y 
probability a 1 or a 0 using the logistic S-curve shown above. 

Dehumanization is not a new phenomenon in human conflict. Societies 
have dehumanized their adversaries since the beginnings of civilization in order 
to allow them to seize, coerce, maim, or ultimately to kill while avoiding the 
pain of conscience for committing these extreme, violent actions. By taking 
away the human traits of these opponents, adversaries are made to be objects 
deserving of wrath and meriting the violence as justice.⁷ Dehumanization
still occurs today in both developed and underdeveloped societies. The next 
example analyzes the impact that dehumanization has in its various forms on 
the outcome of a state’s ability to win a conflict. 


Example 6.8. Conflict and Dehumanization. 
To examine dehumanization as a quantitative statistic, we combine a data
set of 25 conflicts from Erik Melander, Magnus Oberg, and Jonathan Hall’s
“Uppsala Peace and Conflict,” (Table 1, pg. 25)⁸ with Joakim Kreutz’s “How
and When Armed Conflicts End: Introducing the UCDP Conflict Termination
Dataset”⁹ to have a designated binary “win-lose” assessment for each conflict.
We will use civilian casualties as a proxy indicator of the degree of
dehumanization during the conflict. The conflicts in Table 6.9 run the gamut from
high- to low-intensity in the spectrum, and include both inter- and intra-state
hostilities. Therefore, the data is a reasonably general representation.

7 See David L. Smith, Less Than Human: Why We Demean, Enslave, and Exterminate
Others.


TABLE 6.9: Top 25 Worst Conflicts Estimated by War-Related Deaths

                                              Side A:    Civilian   Military   Percentage
Year      Side A           Side B             Win = 1    Deaths     Deaths     Civilian
                                              Lose = 0   (1,000s)   (1,000s)   Deaths
1946-48   India            CPI                1            800         0       100.0
1949-62   Colombia         Mil. Junta         1            200       100        66.67
1950-51   China            Taiwan             1          1,000         *       100.0
1950-53   Korea            South Korea        0          1,000     1,889        34.60
1954-62   Algeria/France   FLN                0             82        18        82.00
1956-59   China            Tibet              1             60        40        60.00
1956-65   Rwanda/Tutsi     Hutu               0            102         3        97.14
1961-70   Iraq             KDP                1            100         5        95.24
1963-72   Sudan            Anya Nya           1            250       250        50.00
1965-66   Indonesia        OPM                1            500         *       100.0
1965-75   N. Vietnam       S. Vietnam         1          1,000     1,058        48.59
1966-87   Guatemala        FAR                1            100        38        72.46
1967-70   Nigeria          Rep. Biafra        1          1,000     1,000        50.00
1967-70   Egypt            Israel             0             50        25        66.67
1971-71   Bangladesh       JSS/SB             1          1,000       500        66.67
1971-78   Uganda           Military Fact.     1            300         0       100.0
1972-72   Burundi          Military Fact.     1             80        20        80.00
1974-87   Ethiopia         OLF                1            500        46        91.58
1975-90   Lebanon          LNM                1             76        25        75.25
1975-78   Cambodia         Khmer Rouge        0          1,500       500        75.00
1975-87   Angola           FNLA               1            200        13        93.90
1978-87   Afghanistan      USSR               1             50        50        50.00
1979-87   El Salvador      FMLN               1             50        15        76.92
1981-87   Uganda           Kikosi Maalum      1            100         2        98.04
1981-87   Mozambique       Renamo             1            350        51        87.28

* denotes missing values.


8 E. Melander, M. Oberg, and J. Hall, “The ‘New Wars’ Debate Revisited: An Empirical
Evaluation of the Atrociousness of ‘New Wars’,” Uppsala Univ. Press, Uppsala, 2006.
Available at www.pcr.uu.se/digitalAssets/654/c_654444-]_1-k_uprp_no_9.pdf.
9 J. Kreutz, “How and When Armed Conflicts End: Introducing the UCDP Conflict
Termination Dataset,” J. Peace Research, 47(2), 2010, 243-250.
By including the ratio of civilian casualties to total casualties in Table 6.9,
we are able to determine what percentage of casualties in each conflict is 
civilian. This ratio provides a quantifiable variable to analyze. 

Binary logistic regression analysis is the first method to choose to analyze 
the interrelation of dehumanization’s effects (shown by proxy through higher 
percentages of civilian casualties) on the outcome of conflict as a win (1) or 
a loss (0). This type of regression model will allow us to infer whether or 
not the independent variable, civilian casualties percentage, has a statistically 
significant impact on the conflict’s outcome, win or lose. Using the data from 
Table 6.9, we assign the civilian casualty percentages to be the independent 
variable and Side A’s win/loss outcome of the conflict to be the binary depen- 
dent variable, then develop a binary logistic regression model. Use Maple to 
derive the logistic regression statistics from the model as follows. 


[> with(Statistics) :

[> CivilianCasualtyPercent := 0.01 * [100, 66.67, 100, 34.6, 82, 60, 97.14,
       95.24, 50, 100, 48.59, 72.46, 50, 66.67, 66.67, 100, 80, 91.58, 75.25, 75, 93.9,
       50, 76.92, 98.04, 87.28] :

[> WinLose := [1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1] :

[> model := exp(a*x + b)/(1 + exp(a*x + b)) :








We derive estimates of the parameters from the data. (See, e.g., Bauldry 
[B1997] for simple methods.) Take $a = -1.9$ and $b = 0.05$ initially.


[> NonlinearFit(model, CivilianCasualtyPercent, WinLose, t, initialvalues
       = [a = -1.9, b = 0.05]);
                    0.800002645508260


This result does not pass the common sense test. Ask Maple for more infor- 
mation by increasing infolevel. 





[> infolevel[Statistics] := 2 :
   infolevel[Optimization] := 1 :

[> NonlinearFit(model, CivilianCasualtyPercent, WinLose, t, initialvalues
       = [a = -1.9, b = 0.05]);

In NonlinearFit (algebraic form)
LSSolve: calling nonlinear LS solver
LSSolve: using method=modifiednewton
LSSolve: number of problem variables 3
LSSolve: number of residuals 25
attemptsolution: conditions for a minimum are not all satisfied, but
a better point could not be found

                    0.800002645508260
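Maple’s NonlinearFit could not optimize the regression. (The log reports three
problem variables although the model has only two parameters, a and b; since the
call names t as the independent variable, it appears that the x in the model
was treated as a third parameter.) Let’s try our NonlinearRegression.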


[> with(PSMv2) :

[> NonlinearRegression(CivilianCasualtyPercent, WinLose, model,
       [a = -1.9, b = 0.05]);

In Fit
In NonlinearFit (algebraic form)
LSSolve: calling nonlinear LS solver
LSSolve: using method=modifiednewton
LSSolve: number of problem variables 2
LSSolve: number of residuals 25
attemptsolution: conditions for a minimum are not all satisfied, but
a better point could not be found


      Coefficient   Standard Error   T-Statistic   P-value
a      1.9305       2.6384            0.73170      0.47174
b     -0.044361     1.8723           -0.023693     0.98130

Model:
\[ m(x) = \frac{e^{1.9305 x - 0.044361}}{1 + e^{1.9305 x - 0.044361}} \]








This logistic model result appears much better at first look. However, the 
coefficients’ P-values tell us to have no confidence in the model. Graph the 
model with the data! 

Analysis Interpretation: The conclusion from our analysis is that the civilian 
casualty percentages are not significantly correlated with whether the conflict 
leads to a win or a loss for Side A. Therefore, from this initial study, we can 
loosely conclude that dehumanization does not have a significant effect on the 
outcome of a state’s ability to win or lose a conflict. Further investigation will 
be necessary. 


One-Predictor Poisson Regression 


According to Devore [D2012], the simple linear regression model is defined by: 


There exist parameters $\beta_0$, $\beta_1$, and $\sigma^2$, such that for any fixed input
value of $x$, the dependent variable is a random variable related to $x$
through the model equation $Y = \beta_0 + \beta_1 x + \varepsilon$. The quantity $\varepsilon$ in
the model equation is the “error”: a random variable assumed to be
normally distributed with mean 0 and variance $\sigma^2$.


We expand this definition to when the response variable $y$ is assumed to
have a normal distribution with mean $\mu_y$ and variance $\sigma^2$. We found that
the mean could be modeled as a function of our multiple predictor variables,
$x_1, x_2, \ldots, x_n$, using the linear function $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n$.
The key assumptions for least squares are
• the relationship between dependent and independent variables is linear,
• errors are independent and normally distributed, and
• homoscedasticity¹⁰ of the errors.


If any assumption is not satisfied, the model’s adequacy is questioned. In first 
courses, patterns seen or not seen in residual plots are used to gain informa- 
tion about a model’s adequacy. (See [AA1979], [D2012]). 


Normality Assumption Lost 

In logistic and Poisson regression, the response variable’s probability lies
between 0 and 1. According to Neter [NKNW1996], this constraint loses both
the normality and the constant variance assumptions listed above. Without
these assumptions, the F and t tests cannot be used for analyzing the
regression model. When this happens, transform the model and the data with a
logistic transformation of the probability $p$, called logit $p$, to map the interval
$(0, 1)$ to $(-\infty, +\infty)$, eliminating the 0-1 constraint:

\[ \ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n. \]

The $\beta$s can now be interpreted as increasing or decreasing the “log odds” of
an event, and $\exp(\beta)$ (the “odds multiplier”) can be used as the odds ratio for
a unit increase or decrease in the associated explanatory variable.
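
As a small worked illustration of the odds multiplier, take the slope fitted in Example 6.6, where the coefficient of flight time was $-0.39190$ (this reuses that fit purely as an example; it is not a new result):

\[ \exp(-0.39190) \approx 0.676. \]

Under that fit, each additional hour of flight time multiplies the fitted odds of damage, $p/(1-p)$, by about 0.676, a reduction of roughly 32% per hour. Note that the multiplier acts on the odds, not on the probability $p$ itself.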

When the response variable is in the form of a count, we face a yet different
constraint. Counts are all nonnegative integers corresponding to rare events. Thus,
a Poisson distribution (rather than a normal distribution) is more appropriate
since the Poisson has a mean greater than 0, and the counts are all nonnegative
integers. Recall that the Poisson distribution gives the probability of $y$ events
occurring in time period $t$ as

\[ P(y; \mu) = \frac{e^{-\mu} \mu^{y}}{y!}. \]

Then the logarithm of the response variable is linked to a linear function of
explanatory variables.

\[ \ln(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n \]

Thus

\[ Y = \left(e^{\beta_0}\right)\left(e^{\beta_1 x_1}\right)\left(e^{\beta_2 x_2}\right) \cdots \left(e^{\beta_n x_n}\right). \]


In other words, a Poisson regression model expresses the “log outcome rate” 
as a linear function of the predictors, sometimes called “exposure variables.” 





10 Homoscedasticity: All random variables have the same finite variance.
Assumptions in Poisson Regression

There are several key assumptions in Poisson regression that are different from 
those in the simple linear regression model. These assumptions include that 
the logarithm of the dependent variable changes linearly with equal incre- 
mental increases in the exposure variable; i.e., the relationship between the 
logarithm of the dependent variable and the independent variables is linear. 
For example, if we measure risk in exposure per unit time with one group
as counts per month, while another is counts per year, we can convert all
exposures to strictly counts. We find that changes in the rate from combined
effects of different exposures are multiplicative; i.e., changes in the log of the 
rate from combined effects of different exposures are additive. We find for each 
level of the covariates, the number of cases has variance equal to the mean, 
making it follow a Poisson distribution. Further, we assume the observations 
are independent. 

Here, too, we use diagnostic methods to identify violations of the assump- 
tions. To determine whether variances are too large or too small, plot residuals 
versus the mean at different levels of the predictor variables. Recall that in 
simple linear regression, one diagnostic of the model used plots of residuals 
against fits (fitted values). We will look for patterns in the residual or devia- 
tion plots as our main diagnostic tool for Poisson regression. 


Poisson Regression Model 
The basic model for Poisson regression is

\[ Y_i = E[Y_i] + \varepsilon_i \quad \text{for } i = 1, 2, \ldots, n. \]

The $i$th case mean response is denoted by $\mu_i$, where $\mu_i$ can be one of many
defined functions (Neter [NKNW1996]). We will only use the form

\[ \mu_i = \mu(x_i, b) = \exp(x_i^{T} b), \quad \text{where } \mu_i > 0. \]

We assume that the $Y_i$ are independent Poisson random variables with
expected value $\mu_i$.

In order to apply regression techniques, we will use the likelihood function 
L (see [AA1979, D2012]) given by 


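\[ L = \prod_{i=1}^{n} f_i(Y_i) = \prod_{i=1}^{n} \frac{\left[\mu(x_i, b)\right]^{Y_i} \exp\left(-\mu(x_i, b)\right)}{Y_i!}. \]

Maximizing this function is intrinsically quite difficult. Instead, maximize the
logarithm of the likelihood function shown below.

\[ \ln(L) = \sum_{i=1}^{n} Y_i \ln(\mu_i) - \sum_{i=1}^{n} \mu_i - \sum_{i=1}^{n} \ln(Y_i!) \tag{6.4} \]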
Numerical techniques are used to maximize In(L) to obtain the best estimates 
for the coefficients of the model. Often, “good” starting points are required to 
obtain convergence to the maximum ([Fox2012]). 
The deviations or residuals will be used to analyze the model. In Poisson
regression, the deviance is given by

\[ \mathrm{Dev} = 2\left[\sum_{i=1}^{n} Y_i \ln\left(\frac{Y_i}{\mu_i}\right) - \sum_{i=1}^{n} (Y_i - \mu_i)\right] \tag{6.5} \]

where $\mu_i$ is the fitted model; whenever $Y_i = 0$, we set $Y_i \cdot \ln(Y_i/\mu_i) = 0$.
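
The deviance (6.5) is straightforward to evaluate once fitted means are available. The following is a minimal Maple sketch, not the text's own program: the procedure name PoissonDeviance and the three-point data set are made up for illustration, and the only rule taken from the text is that the logarithmic term is set to 0 when an observed count is 0.

```maple
# Hypothetical helper: deviance (6.5) for observed counts Y and fitted means Mu.
PoissonDeviance := proc(Y::list, Mu::list)
    local i, term;
    # Convention from the text: Y*ln(Y/mu) is taken to be 0 when Y = 0.
    term := (y, m) -> `if`(y = 0, 0, y*ln(y/m));
    2*(add(term(Y[i], Mu[i]), i = 1..nops(Y))
       - add(Y[i] - Mu[i], i = 1..nops(Y)));
end proc:

# Made-up counts and fitted means, for illustration only.
evalf(PoissonDeviance([0, 2, 5], [0.5, 2.0, 4.0]));
```

When every fitted mean equals its observed count, both sums vanish and the deviance is 0, which is a quick sanity check on the procedure.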
Diagnostic testing of the coefficients is carried out in the same fashion 
as for logistic regression. To estimate the variance-covariance matrix, use the
Hessian matrix $H(X)$, the matrix of second partial derivatives of the
log-likelihood function $\ln(L)$ of (6.4). Then the approximated variance-covariance
matrix is $VC(X, B) = -H(X)^{-1}$ evaluated at $B$, the final estimates of the
coefficients. The main diagonal elements of $VC$ are estimates for the variance;
the estimated standard deviations $se_B$ are the square roots of the main
diagonal elements. Then perform hypothesis tests on the coefficients using t-tests.
Two examples using the Hessian follow. 


Example 6.9. Hessian-based Modeling. 
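Consider the model $y_i = \exp(b_0 + b_1 x_i)$ for $i = 1, 2, \ldots, n$.

Put this model into (6.4) to obtain

\[ \ln(L) = \sum_{i=1}^{n} y_i \ln\left(\exp(b_0 + b_1 x_i)\right) - \sum_{i=1}^{n} \exp(b_0 + b_1 x_i) - \sum_{i=1}^{n} \ln(y_i!) \]
\[ \phantom{\ln(L)} = \sum_{i=1}^{n} y_i (b_0 + b_1 x_i) - \sum_{i=1}^{n} \exp(b_0 + b_1 x_i) - \sum_{i=1}^{n} \ln(y_i!). \]

The Hessian $H = [h_{ij}]$ comes from

\[ h_{ij} = \frac{\partial^2 \ln(L)}{\partial b_i \, \partial b_j} \quad \text{for all } i \text{ and } j, \]

which gives the estimate of the variance-covariance matrix $VC = -H^{-1}$.

For the two-parameter model ($b_0$ and $b_1$), the Hessian is

\[ H(X) = \begin{bmatrix} -\sum_{i=1}^{n} y_i & -\sum_{i=1}^{n} x_i y_i \\[4pt] -\sum_{i=1}^{n} x_i y_i & -\sum_{i=1}^{n} x_i^2 y_i \end{bmatrix}. \]

Here each $y_i$ in the matrix is the model value $\exp(b_0 + b_1 x_i)$.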

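Change the model slightly, adding a second independent variable with a third
parameter. The model becomes $y_i = \exp(b_0 + b_1 x_{1i} + b_2 x_{2i})$ for $i = 1, 2, \ldots, n$.
Compute the new Hessian and carefully note the similarities.

\[ H(X) = -\begin{bmatrix}
\sum_{i=1}^{n} y_i & \sum_{i=1}^{n} x_{1i} y_i & \sum_{i=1}^{n} x_{2i} y_i \\[4pt]
\sum_{i=1}^{n} x_{1i} y_i & \sum_{i=1}^{n} x_{1i}^2 y_i & \sum_{i=1}^{n} x_{1i} x_{2i} y_i \\[4pt]
\sum_{i=1}^{n} x_{2i} y_i & \sum_{i=1}^{n} x_{1i} x_{2i} y_i & \sum_{i=1}^{n} x_{2i}^2 y_i
\end{bmatrix} \]

The pattern in the matrix is easily extended to obtain the Hessian for a model
with $n$ independent variables.

Let $y_i = \exp(b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_n x_{ni})$. The general Poisson model
Hessian is

\[ H(X) = -\begin{bmatrix}
\sum_{i=1}^{n} y_i & \sum_{i=1}^{n} x_{1i} y_i & \sum_{i=1}^{n} x_{2i} y_i & \cdots & \sum_{i=1}^{n} x_{ni} y_i \\[4pt]
\sum_{i=1}^{n} x_{1i} y_i & \sum_{i=1}^{n} x_{1i}^2 y_i & \sum_{i=1}^{n} x_{1i} x_{2i} y_i & \cdots & \sum_{i=1}^{n} x_{1i} x_{ni} y_i \\[4pt]
\sum_{i=1}^{n} x_{2i} y_i & \sum_{i=1}^{n} x_{1i} x_{2i} y_i & \sum_{i=1}^{n} x_{2i}^2 y_i & \cdots & \sum_{i=1}^{n} x_{2i} x_{ni} y_i \\[4pt]
\vdots & \vdots & \vdots & \ddots & \vdots \\[4pt]
\sum_{i=1}^{n} x_{ni} y_i & \sum_{i=1}^{n} x_{1i} x_{ni} y_i & \sum_{i=1}^{n} x_{2i} x_{ni} y_i & \cdots & \sum_{i=1}^{n} x_{ni}^2 y_i
\end{bmatrix} \]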


Replace the formulas with numerical values from the data. The resulting
symmetric square matrix should be non-singular. Compute the inverse of the
negative of the Hessian matrix to find the variance-covariance matrix $VC$. The
main diagonal entries of $VC$ are the (approximate) variances of the estimated
coefficients $b_i$. The square roots of the entries on the main diagonal are the
estimates of $se(b_i)$, the standard error for $b_i$, to be used in the hypothesis
testing with $t^* = b_i/se(b_i)$.
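
The recipe above (fill in the Hessian, invert its negative, take square roots of the diagonal) can be sketched in Maple for the two-parameter model. This is an illustrative sketch only: the procedure name PoissonSE, the four x-values, and the coefficient values passed to it are all hypothetical, and the matrix entries follow the two-parameter Hessian of Example 6.9 with $y_i$ taken as the model value $\exp(b_0 + b_1 x_i)$.

```maple
# Hypothetical helper: standard errors of b0 and b1 for y = exp(b0 + b1*x).
PoissonSE := proc(X::list, b0, b1)
    local i, n, mu, H, VC;
    n  := nops(X);
    mu := [seq(exp(b0 + b1*X[i]), i = 1..n)];      # model values y_i
    # Two-parameter Hessian of ln(L), as in Example 6.9
    H  := -Matrix([[add(mu[i], i = 1..n),      add(X[i]*mu[i], i = 1..n)],
                   [add(X[i]*mu[i], i = 1..n), add(X[i]^2*mu[i], i = 1..n)]]);
    VC := LinearAlgebra:-MatrixInverse(-H);        # variance-covariance matrix
    [sqrt(VC[1, 1]), sqrt(VC[2, 2])];              # [se(b0), se(b1)]
end proc:

# Made-up predictor values and coefficient estimates, for illustration only.
se := PoissonSE([1, 2, 3, 4], 0.5, 0.2);
tstar := [0.5/se[1], 0.2/se[2]];                   # t* = b_i / se(b_i)
```

In practice the coefficient arguments would be the final estimates from the numerical maximization of (6.4), and the x-values would be the observed predictor data.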

We now have all the information we need to build the tables for a Poisson 
regression that are similar to a regression program’s output. 


Estimating the Regression Coefficients: Summary 
The number of predictor variables plus one (for the constant term) gives the
number of coefficients in the model $y_i = \exp(b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_n x_{ni})$.
Estimates of the $b_i$ are the final values from the numerical search method (if
it converged) used to maximize the log-likelihood function $\ln(L)$ of (6.4). The
values of $se(b_i)$, the standard error estimate for $b_i$, are the square roots of
the main diagonal of the variance-covariance matrix $VC = -H(X)^{-1}$. The
test statistics are $t^* = b_i/se(b_i)$, and the p-value is the probability $P(T > |t^*|)$. In the
summary table of Poisson regression analysis below, let $m$ be the number of
variables in the model, and let $k$ be the number of data elements of $y$, the
dependent variable. A summary appears in Table 6.10.


TABLE 6.10: Poisson Regression Variables Summary

             Degrees of                                     Mean Deviance
             Freedom (df)   Deviance                        (MDev)                          Ratio
Regression   m              D_reg = D_t - D_res             MDev(reg) = D_reg / m           MDev(reg) / MDev(res)
Residual     k - 1 - m      D_res = result from the full    MDev(res) = D_res / (k - 1 - m)
                            model with m predictors
Total        k - 1          D_t = result from the reduced   MDev(t) = D_t / (k - 1)
                            model y = e^(b0)


Note that a prerequisite for using Poisson regression is that the dependent 
variable Y must be discrete counts with large numbers being a rare event. 

We have chosen two data sets that have published solutions to be our basic 
examples. First, an outline of the procedure: 


STEP 0. Enter the data for X and Y. 
STEP 1. For Y: 


(a) generate a histogram, and 


(b) perform a chi-squared goodness-of-fit test for a Poisson distribution.¹¹


If Y follows a Poisson distribution, then continue. If Y is “count data,” use 
Poisson regression regardless of the chi-squared test. 


STEP 2. Compute the value of bo in the constant model y = exp(bo) that 
minimizes (6.5); i.e., minimize two times the deviations. 





11 See, e.g., stattrek.com/chi-square-test/goodness-of-fit.aspx for an introduction.

STEP 3. Compute the values of $b_0$ and $b_1$ in the model $y = \exp(b_0 + b_1 x)$ that
minimize the deviation (6.5).


STEP 4. Interpret the results and the odds ratio. 


We’ll step through an example following the outline above. 


Example 6.10. Hospital Surgeries. 
A group of hospitals has collected data on the numbers of Caesarean surgeries 
vs. the total number of births (see Table 6.11).¹²


TABLE 6.11: Total Births vs. Caesarean Surgeries 


Total | 3246 2750 2507 2371 1904 1501 1272 1080 1027 970 
Special | 26 24 21 21 21 20 19 18 18 17 





Total | 739 679 502 236 357 309 192 138 100 95 
Special | 17 16 16 16 16 15 14 14 13 13 








Use the hospitals’ data set to perform a Poisson regression following the 
steps listed above. 


STEP 0. Enter the data. 


[> xhc := [3246, 2750, 2507, 2371, 1904, 1501, 1272, 1080, 1027, 970, 739,
          679, 502, 236, 357, 309, 192, 138, 100, 95] :
   yhc := [26, 24, 21, 21, 21, 20, 19, 18, 18, 17, 17, 16, 16, 16, 16, 15, 14, 14,
          13, 13] :
   N := nops(yhc);

                    N := 20


STEP 1. Plot a histogram, and then perform a Chi-square Goodness-of-fit test 
on yhc, if appropriate. 

(Note: Maple’s Histogram function is in the Statistics package. There are a 
large number of options for binning the data; we will use frequencyscale = 
absolute to have the heights of the bars equal to the frequency of entries in 
the associated bin. Collect the bin counts with TallyInto.) 


[> with(Statistics) :
   Dirac(0.) := 1.0 :   # needed for the Poisson distribution

[> Bins := [$(min(yhc)..max(yhc))];

        Bins := [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26]








12 Adapted from “Research Methods II: Multivariate Analysis,” J. Trop. Pediatrics, 
Online Feature, (2009), pp. 136-143. Originally at: www.oxfordjournals.org/our_journals/ 
tropej/online/ma_chap13.pdf. 
[> Histogram(yhc, frequencyscale = absolute);

[Histogram of yhc: counts from 13 to 26 on the horizontal axis, absolute frequencies from 0 to 4 on the vertical axis.]

[> TallyInto(yhc, default, bins = max(yhc) - min(yhc) + 1) :
   ObsFreq := rhs~(%);

        ObsFreq := [2, 2, 1, 4, 2, 2, 1, 1, 3, 0, 0, 1, 0, 1]





Now for the chi-squared test. First, generate the predicted values from an 
estimated Poisson distribution. 

[> λest := Mean(yhc);

        λest := 17.7500000000000

[> P := Distribution(Poisson(λest)) :

[> PredFreq := [seq(N*PDF(P, t), t in Bins)];

        PredFreq := [1.090438738, 1.382520543, 1.635982643, 1.814918244,
            1.894988167, 1.868668887, 1.745730144, 1.549335503, 1.309557389,
            1.056574712, 0.8154000494, 0.6030562866, 0.4281699634, 0.2923083404]


We are ready to use Maple’s chi-squared test, ChiSquareGoodnessOfFitTest, 
with a significance level of 0.05. Use the summarize = embed option, as it 
produces the most readable output. The command is terminated with a colon: 
“embedding the output” makes it unnecessary to return a result. 
[> ChiSquareGoodnessOfFitTest(ObsFreq, PredFreq, level = 0.05,
       summarize = embed) :

Chi-Square Test for Goodness-of-Fit

Null Hypothesis:         Observed sample does not differ from expected sample
Alternative Hypothesis:  Observed sample differs from expected sample

Categories | Distribution  | Computed Statistic | Computed p-value | Critical Value
14         | ChiSquare(13) | 10.8977            | 0.619387         | 22.3620

Result: Accepted: This statistical test does not provide enough evidence to
conclude that the null hypothesis is false.














The chi-squared test indicates that a Poisson distribution is reasonable. 


STEP 2. Find the best constant model y = exp(bo). 
Let’s use Maple’s LinearFit on the function $Y = \ln(y) = b_0$.

[> LinearFit(b0, xhc, ln~(yhc), x, summarize = embed) :

Summary
Model: 2.8581739

Coefficients | Estimate | Standard Error | t-value | P(>|t|)
Parameter 1  | 2.85817  | 0.0434058      | 65.8478 | 0.

R-squared: -1.77636 × 10^(-15)
Adjusted R-squared: -1.77636 × 10^(-15)

Residuals
Residual Sum of Squares | Residual Mean Square | Residual Standard Error | Degrees of Freedom
0.715944                | 0.0376813            | 0.194117                | 19

Five Point Summary
Minimum   | First Quartile | Median     | Third Quartile | Maximum
-0.293225 | -0.123233      | -0.0249605 | 0.166019       | 0.399923
STEP 3. Find the best exponential model $y = \exp(b_0 + b_1 x)$.
Let’s use Maple’s ExponentialFit to find the model.




























































































[> ExponentialFit(xhc, yhc, x, summarize = embed) :

Summary
Model: 14.136875 e^(0.00019056858 x)

Coefficients | Estimate    | Standard Error | t-value | P(>|t|)
Parameter 1  | 2.64879     | 0.0192928      | 137.294 | 0.
Parameter 2  | 0.000190569 | 0.0000132697   | 14.3612 | 2.66 × 10^(-11)

R-squared: 0.999650
Adjusted R-squared: 0.999611

Residuals
Residual Sum of Squares | Residual Mean Square | Residual Standard Error | Degrees of Freedom
0.0574687               | 0.00319270           | 0.0565040               | 18

Five Point Summary
Minimum   | First Quartile | Median     | Third Quartile | Maximum
-0.102894 | -0.0420307     | 0.00279072 | 0.0449234      | 0.0788279
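
For comparison, the coefficients can also be estimated by maximizing the log-likelihood (6.4) directly rather than by a least-squares fit on the log-transformed counts. The sketch below is an assumption-laden illustration, not the text's procedure: it uses Maple's Optimization package, drops the constant term $\sum \ln(y_i!)$ (which does not depend on the coefficients), and starts the search near the ExponentialFit estimates. Because the criterion differs, the resulting values need not match the ExponentialFit output exactly, and the tiny scale of b1 may make the search sensitive to the starting point.

```maple
with(Optimization):

# Data from STEP 0 of Example 6.10
xhc := [3246, 2750, 2507, 2371, 1904, 1501, 1272, 1080, 1027, 970, 739,
        679, 502, 236, 357, 309, 192, 138, 100, 95]:
yhc := [26, 24, 21, 21, 21, 20, 19, 18, 18, 17, 17, 16, 16, 16, 16, 15, 14, 14,
        13, 13]:

# ln(L) from (6.4) with mu_i = exp(b0 + b1*x_i), omitting the constant sum of ln(y_i!)
lnL := add(yhc[i]*(b0 + b1*xhc[i]) - exp(b0 + b1*xhc[i]), i = 1..nops(yhc)):

# Start near the ExponentialFit estimates; returns [maximum value, [b0 = ..., b1 = ...]]
Maximize(lnL, initialpoint = {b0 = 2.6, b1 = 0.0002});
```

If the search has trouble converging, rescaling the predictor (for example, measuring births in thousands) is a common remedy.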


STEP 4. Conclude by calculating the odds-ratio. 
Use the odds-multiplier $\exp(b_1)$ as the approximate odds-ratio, often called
risk-ratio for Poisson regression.


[> OR := exp(0.000190569);
                    OR := 1.000190587





OR represents the potential increase resulting from one unit increase in x. 
(How does this concept relate to “opportunity cost” in linear programming 
and “marginal revenue” in economics?) 


Return to the Philippines example relating literacy and violence described 
in the opening of this chapter. 


Example 6.11. Violence in the Philippines. 
The number of significant acts of violence, SigActs in Table 6.12, are integer
counts.¹³





13 Data sources: National Statistics Office (Manila, Philippines) and the Archives of the
Armed Forces of the Philippines.
TABLE 6.12: Literacy Rate (Lit) vs. Significant Acts of Violence (SigActs),
Philippines, 2008.

Province              Lit    SigActs    Province              Lit    SigActs
Basilan               71.6    29        Dinagat Islands       85.7     0
Lanao del Sur         71.6    30        Surigao del Norte     85.7    10
Maguindanao           71.6   122        Surigao del Sur       85.7    31
Sulu                  71.6    26        Bukidnon              85.9    14
Tawi-Tawi             71.6     1        Camiguin              85.9     0
Biliran               72.9     0        Lanao del Norte       85.9    57
Eastern Samar         72.9    11        Misamis Occidental    85.9     8
Leyte                 72.9     2        Misamis Oriental      85.9     7
Northern Samar        72.9    23        Batanes               86.1
Southern Leyte        72.9     0        Cagayan               86.1    15
Western Samar         72.9    64        Isabela               86.1     4
North Cotabato        78.3   125        Nueva Vizcaya         86.1     3
Sarangani             78.3    23        Quirino               86.1     0
South Cotabato        78.3     5        Bohol                 86.6     2
Sultan Kudarat        78.3    18        Cebu                  86.6     0
Zamboanga del Norte   79.6     8        Negros Oriental       86.6    27
Zamboanga del Sur     79.6    10        Siquijor              86.6     0
Zamboanga Sibugay     79.6     3        Abra                  89.2    11
Albay                 79.9    35        Apayao                89.2     0
Camarines Norte       79.9    12        Benguet               89.2     0
Camarines Sur         79.9    44        Ifugao                89.2
Catanduanes           79.9     9        Kalinga               89.2    11
Masbate               79.9    42        Mountain Province     89.2     0
Sorsogon              79.9    52        Ilocos Norte          91.3     0
Compostela Valley     81.7   126        Ilocos Sur            91.3     2
Davao del Norte       81.7    35        La Union              91.3     0
Davao del Sur         81.7    64        Pangasinan            91.3     0
Davao Oriental        81.7    40        Aurora                92.1    10
Aklan                 82.6     0        Bataan                92.1     1
Antique               82.6     1        Bulacan               92.1     6
Capiz                 82.6     8        Nueva Ecija           92.1     4
Guimaras              82.6     0        Pampanga              92.1     3
Iloilo                82.6     8        Tarlac                92.1     4
Negros Occidental     82.6    26        Zambales              92.1     6
Marinduque            83.9     0        Batangas              93.5     5
Occidental Mindoro    83.9     5        Cavite                93.5     0
Oriental Mindoro      83.9     7        Laguna                93.5     4
Palawan               83.9     2        Quezon                93.5    28
Romblon               83.9     0        Rizal                 93.5     3
Agusan del Norte      85.7    13        Metropolitan Manila   94       1
Agusan del Sur        85.7    33
The literacy data has been defined as L, the SigActs as V. Examine the
histogram in Figure 6.6 to see that the data appears to follow a Poisson
distribution. A goodness-of-fit test (left as an exercise) confirms the data
follows a Poisson distribution.

FIGURE 6.6: Histogram of SigActs Data


Use Maple to fit the data. First, remove the three outlier data points 
with values well over 100, as there are other much more significant generators 
of violence beyond literacy levels in those regions. We cannot use Maple’s 


ExponentialFit, as it attempts a log-transformation of SigActs which fails due 
to 0 values. 


[> model := exp(b0 + b1*x) :

[> outops := [leastsquaresfunction, degreesoffreedom, residualmeansquare,
       residualstandarddeviation, residualsumofsquares] :
   theLabels := [`Degrees of Freedom`, `Residual Mean Square`,
       `Residual Standard Deviation`, `Residual Sum of Squares`] :

[> m := NonlinearFit(model, L, V, x, output = outops) :

[> f := unapply(fnormal(m[1], 5), x);

        f := x -> e^(-0.055437 x + 7.1439)

[> Matrix([theLabels, m[2..5]]) :
   LinearAlgebra:-Transpose(%);

        `Degrees of Freedom`             76
        `Residual Mean Square`           235.8330320
        `Residual Standard Deviation`    15.35685619
        `Residual Sum of Squares`        17923.31043


Plot the fit. 


[> display(
       pointplot([L, V], symbol = solidcircle, symbolsize = 14),
       plot(f, 70..95, 0..70, thickness = 2),
       title = "Violence v. Literacy", titlefont = [TIMES, 14],
       labels = [SigActs, Count], labeldirections = [horizontal, vertical],
       labelfont = [TIMES, 14]
   );

[Plot titled "Violence v. Literacy": the data points with the fitted curve f.]
We accept that the fit looks pretty good.

The odds multiplier, $e^{b_1}$, for our fit is $e^{-0.055437} \approx 0.946$, which means that
for every 1 unit increase in literacy we expect violence to go down by about 5.4%.
This value suggests improving literacy will help ameliorate the violence.
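
The 5.4% figure follows directly from the multiplier, and the same arithmetic shows how the effect compounds over larger changes in literacy. The ten-point change below is an arbitrary illustration, not a case considered in the text:

\[ 1 - e^{-0.055437} \approx 1 - 0.946 = 0.054, \qquad e^{10 \times (-0.055437)} = e^{-0.55437} \approx 0.574. \]

So under the fitted model a ten-point increase in literacy corresponds to about 43% fewer significant acts, not $10 \times 5.4\% = 54\%$, because the effect of the predictor is multiplicative rather than additive.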


Poisson Regression with Multiple Predictor Variables in Maple 


Often, there are many variables that influence the outcome under study. We’ll 
add a second predictor to the Hospital Births problem. 
Example 6.12. Hospital Births Redux. 
Revisit Example 6.10 with an additional predictor: the type of hospital, rural 
(0) or urban (1). The new data appear in Table 6.13. 


TABLE 6.13: Total Births vs. Caesarean Surgeries and Hospital Type 


Total   | 3246 2750 2507 2371 1904 1501 1272 1080 1027  970 
Special |   26   24   21   21   21   20   19   18   18   17 
Type    |    1    1    1    1    1    1    1    1    1    1 

Total   |  739  679  502  236  357  309  192  138  100   95 
Special |   17   16   16   16   16   15   14   14   13   13 
Type    |    1    1    1    1    1    0    1    0    0    0 








The data has been entered as B: Total, C: Special, and T: Type. After 
loading the Statistics package, define the model. 


> model := exp(a + b*x + c*y); 
                model := e^(a + b x + c y) 

Collect the data and use NonlinearFit to fit the model. 

> data := <<C> | <T> | <B>> : 

> outops := [leastsquaresfunction, degreesoffreedom, residualmeansquare, 
             residualstandarddeviation, residualsumofsquares] : 
  theLabels := [`Degrees of Freedom`, `Residual Mean Square`, 
                `Residual Standard Deviation`, `Residual Sum of Squares`] : 

> m := NonlinearFit(model, data, [x, y], output = outops) : 

> f := unapply(fnormal(m[1], 5), x); 
                f := x -> e^(0.15397 x + 1.2047 y + 3.0022) 

> Matrix([theLabels, m[2..5]]) : 
  LinearAlgebra:-Transpose(%); 

        `Degrees of Freedom`             17 
        `Residual Mean Square`           124410. 
        `Residual Standard Deviation`    352.72 
        `Residual Sum of Squares`        2.1150 10^6 


Finishing the statistical analysis of the model is left as an exercise. 
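
As a starting point for that analysis, the fitted coefficients can be read as multipliers and the model can be evaluated at an observed data point. The following is a minimal Python sketch (not Maple), assuming the coefficients printed above; the function name `predicted_births` is ours.

```python
import math

# Coefficients of the fitted model exp(A + B*x + C*y), read from the output above:
# x is the Special count and y is the hospital type (0 rural, 1 urban).
A, B, C = 3.0022, 0.15397, 1.2047


def predicted_births(special, hospital_type):
    """Expected total births under the fitted two-predictor model."""
    return math.exp(A + B * special + C * hospital_type)


# Multiplier on expected births for each additional special delivery,
# and for an urban hospital relative to a rural one.
special_multiplier = math.exp(B)
urban_multiplier = math.exp(C)

# First column of Table 6.13: Total 3246, Special 26, Type 1.
fitted = predicted_births(26, 1)
residual = 3246 - fitted

print(f"special x{special_multiplier:.3f}, urban x{urban_multiplier:.3f}")
print(f"fitted = {fitted:.0f}, residual = {residual:.0f}")
```

Repeating the last step for every column of the table gives the residuals needed for a residual plot.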
Exercises 


1. Adjust the nonlinear model for Afghanistan casualties, Example 6.5, to 
increase the amplitude of the sine term more quickly. How does the con- 
clusion change, if at all? 


2. Investigate the action of parameters in the logistic function by executing 
the Maple statements below using the Explore command to make an inter- 
active graph. 


> LP := (a, b) -> plot(1/(1 + exp(a*x + b)), x = -4..4, y = -0.1..1.1) : 
> Explore(LP(a, b), a = -10.0..10.0, b = -10.0..10.0); 





3. For the data in Table 6.14 (a) plot the data and (b) state the type of 
regression that should be used to model the data. 


TABLE 6.14: Tire Tread Data 


Number Hours Tread (cm) 





1 2 5.4 
2 5 5.0 
3 7 4.5 
4 10 3.7 
5 14 3.5 
6 19 2.5 
7 26 2.0 
8 31 1.6 
9 34 1.8 
10 38 1.3 
11 45 0.8 
12 52 1.1 
13 53 0.8 
14 60 0.4 
15 65 0.6 
4. Assume the suspected nonlinear model for the data of Table 6.15 is 

$$Z = a\,\frac{x^{b}}{y^{c}}.$$ 

If we use a log-log transformation, we obtain 

$$\ln(Z) = \ln(a) + b\ln(x) - c\ln(y).$$ 


Use regression techniques to estimate the parameters a, b, and c, and 
statistically analyze the resulting coefficients. 


TABLE 6.15: Nonlinear Data 


x y Z 
101 15 0.788 
73 3 304.149 
122 5 98.245 
56 20 0.051 
107 20 0.270 
77 5 30.485 
140 15 1.653 
66 16 0.192 
109 5 159.918 
103 14 1.109 
93 3 699.447 
98 4 281.184 
76 14 0.476 
83 5 54.468 
113 12 2.810 
167 6 144.923 
82 5 79.733 
8 6 21.821 
103 20 0.223 
86 11 1.899 
67 8 5.180 
104 13 1.334 
114 5 110.378 
118 21 0.274 
94 5 81.304 
5. Using the basic linear model $y = \beta_0 + \beta_1 x$, fit the following data sets. 
Provide the model, the analysis of variance information, the value of $R^2$, 
and a residual plot. 


(a)  x | 100 125 125 150 150 200 200 
     y | 150 140 180 210 190 320 280 

     x | 250 250 300 300 350 400 400 
     y | 400 430 440 390 600 610 670 


(b) The following data represents change in growth where x is body 
weight and y is normalized metabolic rate for 13 animals. 

     x | 110 115 120 230 235 240 360 
     y | 198 173 174 149 124 115 130 

     x | 362 363 500 505 510 515 
     y | 102  95 122 112  98  96 








6. Fit an appropriate multivariable model to the following ten observations 
of graduate school admissions. Each observation records the applicant's high 
school GPA, GRE score, whether the college was highly selective, and whether 
the student was admitted. 1 indicates “Yes” and 0 indicates “No.” 


GPA GRE Selective Admitted 





3.61 380 0 1 
3.67 660 1 0 
4.00 800 1 0 
3.19 640 0 0 
2.93 520 0 1 
3.00 760 0 0 
2.98 560 0 0 
3.08 400 0 1 
3.39 540 0 0 
3.92 700 1 1 


7. The data set for lung cancer in relation to cigarette smoking in Table 6.16 is 
from Frome, Biometrics 39, 1983, pg. 665-674. The number of person years 
in parentheses is broken down by age and daily cigarette consumption. 
Find and analyze an appropriate multivariate model. 
TABLE 6.16: Lung Cancer Rates for Smokers and Nonsmokers 

                                       Number Smoked per Day 
Age   | Nonsmokers | 1-9      | 10-14    | 15-19    | 20-24     | 25-34     | > 35 
15-20 | 1 (10366)  | 0 (3121) | 0 (3577) | 0 (4319) | 0 (5683)  | 0 (3042)  | 0 (670) 
20-25 | 0 (8162)   | 0 (2397) | 1 (3286) | 0 (4214) | 1 (6385)  | 1 (4050)  | 0 (1166) 
25-30 | 0 (5969)   | 0 (2288) | 1 (2546) | 0 (3185) | 1 (5483)  | 4 (4290)  | 0 (1482) 
30-35 | 0 (4496)   | 0 (2015) | 2 (2219) | 4 (2560) | 6 (4687)  | 9 (4268)  | 4 (1580) 
35-40 | 0 (3152)   | 1 (1648) | 0 (1826) | 0 (1893) | 5 (3646)  | 9 (3529)  | 6 (1136) 
40-45 | 0 (2201)   | 2 (1310) | 1 (1386) | 2 (1334) | 12 (2411) | 11 (2424) | 10 (924) 
45-50 | 0 (1421)   | 0 (927)  | 2 (988)  | 2 (849)  | 9 (1567)  | 10 (1409) | 7 (556) 
50-55 | 0 (1121)   | 3 (710)  | 4 (684)  | 2 (470)  | 7 (857)   | 5 (663)   | 4 (255) 
> 55  | 2 (826)    | 0 (606)  | 3 (449)  | 5 (280)  | 7 (416)   | 3 (284)   | 1 (104) 








8. Model absences from class where: 

School:          school 1 or school 2 
Gender:          female is 1, male is 2 
Ethnicity:       categories 1 through 6 
Math Test:       score 
Language Test:   score 
Bilingual:       categories 1 through 4 

School Gender Ethnicity Math Score Lang. Score Bilingual Status Days Absent 
1 2 4 56.98 42.45 2 4 
1 2 4 37.09 46.82 2 4 
2 1 4 32.37 43.57 2 2 
1 1 4 29.06 43.57 2 3 
2 1 4 6.75 27.25 3 3 
1 1 4 61.65 48.41 0 13 
1 1 4 56.99 40.74 2 11 
2 2 4 10.39 15.36 2 7 
1 2 4 50.52 51.12 2 10 
1 2 6 49.47 42.45 0 9 
Projects 


Project 1. Fit, analyze, and interpret your results for the nonlinear model 
$y = at^b$ with the data provided below. Produce fit plots and residual graphs 
with your analysis. 

t | 7  14   21   28   35   42 
y | 8  41  133  250  280  297 





Project 2. Fit, analyze, and interpret your results for an appropriate model 
with the data provided below. Produce fit plots and residual graphs with your 
analysis. 

Year     |  0    1    2    3    4    5    6    7     8     9    10 
Quantity | 15  150  250  275  270  280  290  650  1200  1550  2750 





Project 3. Fit, analyze, and interpret your results for the nonlinear model 
$y = at^b$ with the data provided by executing the Maple code below. Produce 
fit plots and residual graphs with your analysis. Use your phone number (no 
dashes or parentheses) for PN. 


> randomize(PN) : 
> f := unapply(evalf(rand(1.0..9.0)() * x^(rand(-2.0..2.0)())), x) : 

> t := [(6*k) $ (k = 1..20)] : 
  y := f~(t) + [seq(0.001 * rand(50)(), i = 1..20)] : 
> data := Matrix([t, y]); 








6.5 Conclusions and Summary 


Along with investigating regression, we’ve studied some of the common 
misconceptions decision makers have concerning correlation and regression. Our 
purpose with this presentation is to help prepare more competent and confident 
problem solvers. Data taken from part of a sine curve can have a very poor 
linear correlation, close to zero, yet the decision maker can still describe and 
use the pattern by recognizing the relationship in the data as periodic or 
oscillating. Examples such as these should dispel two ideas: that a correlation 
very close to zero implies no relationship, and that a high linear correlation 
requires a linear model. Decision makers need to see and understand the concepts 
of correlation, linear relationships, nonlinear relationships, and the absence 
of any relationship. 
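
The sine-curve point is easy to reproduce. The following Python sketch (an illustration of ours, not part of the chapter's Maple worksheets) samples one arch of a sine curve, an assumed choice of "part of a sine curve," and computes the Pearson correlation from its definition.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# 41 equally spaced points on one arch of the sine curve, 0 <= x <= pi.
xs = [k * math.pi / 40 for k in range(41)]
ys = [math.sin(x) for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.6f}")
```

Every point lies exactly on the curve, so the relationship is perfect, yet the rising and falling halves cancel and the linear correlation is essentially zero.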
RECOMMENDED STEPS FOR REGRESSION ANALYSIS: 


STEP 1. Ensure you understand the problem and what answers or predictions 
are required. 


STEP 2. Get the data that are available. Identify the dependent and 
independent variables. 


STEP 3. Plot the dependent versus each independent variable, and note any 
apparent trends. 


STEP 4. If the dependent variable is binary {0,1}, then use binary logistic 
regression. If the dependent variable is a count that follows a Poisson 
distribution, then use Poisson regression. Otherwise, try linear, multiple, or 
nonlinear regression as indicated by the situation being studied; science 
trumps curve fitting. 


STEP 5. Ensure your model produces results that are acceptable. Always use 
the common-sense test. 
Problem Solving with Game Theory 











Objectives: 

(1) Know the concept of formulating a game payoff matrix. 
(2) Understand total and partial conflict games. 

(3) Understand the use of LP and NLP in game theory. 


(4) Understand and interpret the solutions. 


In 1943, General Imamura had been ordered to transport Japanese 
troops across the Bismarck Sea to New Guinea; General Kenney, the 
United States commander, wanted to bomb the Japanese troop trans- 
ports prior to their arrival. Imamura had two possible routes to New 
Guinea: a shorter northern route or a longer southern route. Kenney 
had to decide where to send his limited number of search planes to find 
the Japanese fleet. If Kenney sent his planes to the wrong route, he 
could recall them, but the number of bombing days would be reduced. 

Assume both Imamura and Kenney act rationally, each trying to 
obtain the best outcome. The problem to solve is: What are the strate- 
gies each commander should employ? 





7.1 Introduction 


We begin by studying conflict—an important theme in human history. We 
assume that conflict arises when two or more individuals with different views, 
goals, or objectives compete to control the course of future events. Game the- 
ory studies competition and is used to analyze conflict among two or more 
opponents. Mathematical tools are used to study situations in which rational 
players are involved in conflict both with and without cooperation. According 
to Wiens [Wiens2003], game theory studies situations in which parties com- 
pete, and also possibly cooperate, to influence the outcome of interactions to 
each party’s advantage. The situation involves conflict between the partici- 
pants, called players, because some outcomes favor one player at the possible 
expense of the other players. What each player obtains from a particular 
outcome is called the player’s payoff. Each player can choose among a number 
of strategies to influence payoffs. However, each player’s payoff depends on 
the other players’ choices. According to Straffin [Straffin1993], rational players 
desire to maximize their own payoffs. Game theory is a branch of applied 
mathematics that is used most notably in economics, and also in business, biology, 
decision sciences, engineering, political science, international relations, 
operations research, computer science, and philosophy. Game theory 
mathematically captures behavior in strategic situations in which an 
individual’s success in making choices depends on the choices of their 
opponents. Although initially developed to analyze competitions in which one 
individual does better at another’s expense (see “zero-sum games”), game theory 
has grown to treat a wide class of interactions among players in competition. 
Games have many features; a few of the most common are: 


Number of Players: Each participant who can make a choice in a game 
or who receives a payoff from the outcome of those choices is a player. A 
two-person game has two players. A game with three or more players is 
referred to as an N-person game. 


Strategies per Player: Each player chooses from a set of possible actions, 
known as strategies. In a two-person game, we can form a grid of strategies. 
We allow the “row player” to have up to m strategies and the “column 
player” to have up to n strategies. The choice of a particular strategy by 
each player determines the payoff to each player. 


Pure Strategy Solution: If a player should always choose one strategy over 
all other strategies to obtain their best outcome in a game, then that 
strategy represents a pure strategy solution. Otherwise, if strategies should 
be played randomly, then the solution is a mixed strategy solution. 


Nash Equilibrium: A Nash¹ equilibrium is a set of strategies that are 
mutual best responses to the other players’ strategies. In other words, 
if every player is playing their part of a Nash equilibrium, no player has an 
incentive to unilaterally change their strategy. Considering only situations 
where players play a single strategy without randomizing (a pure strategy), 
a game can have any number of Nash equilibria. 


Sequential Game: A game is sequential if one player performs her/his 
actions after another; otherwise, the game is simultaneous. 


Simultaneous Game: A game is simultaneous if the players each choose 
their strategy for the game and implement them at the same time. 





1 John Forbes Nash, Jr., the subject of the 2001 movie “A Beautiful Mind,” received the 
John von Neumann Theory Prize in 1978 for his discovery of the Nash Equilibrium. He also 
received the Nobel Prize in Economics (1994) for his work in game theory and the Abel 
Prize (2015) for his work on nonlinear partial differential equations. 
Perfect Information: A game has perfect information if, in a sequential 
game, every player knows the strategies chosen by the players who 
preceded them or, in a simultaneous game, each player knows the other 
players’ strategies and outcomes in advance. 


Constant Sum or Zero-Sum: A game is a constant sum game if the sums 
of the payoffs are the same for every set of strategies, and a zero-sum game 
if the payoff sum is always zero. In these games, one player gains if and 
only if another player loses; otherwise, we have a variable sum game. 


Extensive Form: The game is presented in a tree diagram. 


Normal Form: The game is presented as a payoff matrix. In this chapter 
we only present the normal form and its associated solution 
methodologies. 


Outcomes: An outcome is a set of payoffs resulting from the actions or strate- 
gies taken by all players. 


Total Conflict Game: A game between players where the sums of the out- 
comes for all strategy pairs are either the same constant or zero. 


Partial Conflict Game: A game whose outcome sums are variable. 


The study of game theory has provided many classical and standard games 
that provide insights into play, tactics, and strategy. Table 7.1 provides a short 
summary of several classical games; a full list is at: 

http://en.wikipedia.org/wiki/List_of_games_in_game_theory 





We will primarily be concerned with two-person games. The irreconcil- 
able, conflicting interests between two players in a game surprisingly resemble 
both parlor games and military encounters between enemy states. Giordano 
et al. [GFH2014] explain two-person games in the context of mathematical 
modeling. Players make moves and counter-moves, until the rules of engage- 
ment declare the game is ended. The rules of engagement determine what each 
player can or must do at each stage—the available and/or required moves given 
the circumstances of the game at the current stage—as the game unfolds. For 
example, in the game Rock, Paper, Scissors, both players simultaneously make 
one move, with rock beating scissors beating paper beating rock. While this 
game consists of only one selection of a move (decision choice), games like 
Chess or Go can require hundreds of moves to end. 

Outcomes or payoffs in a game are determined by the strategies players 
choose and play. These outcomes may come from calculated values or expected 
values, ordinal rankings, cardinal values developed from a lottery system (See 
[vNM1944], [Straffin1993]), or cardinal values derived from pairwise compar- 
isons (See [Fox2014]). Here we will assume we have cardinal outcomes (interval 
or ratio data) or payoffs for our games, since this will allow us to do mathe- 
matical calculations. 


TABLE 7.1: Classical Two-Player Games in Game Theory 

                         Strategies   Number of Pure                Perfect 
Game                     per Player   Strategy Nash    Sequential   Information   Zero Sum 
                                      Equilibria 
Battle of the Sexes      2            2                No           No            No 
Blotto Games             variable     variable         No           No            Yes 
Chicken                  2            2                No           No            No 
Matching Pennies         2            0                No           No            Yes 
Nash Bargaining Game     infinite     infinite         No           No            No 
Prisoner’s Dilemma       2            1                No           No            No 
Rock, Paper, Scissors    3            0                No           No            Yes 
Stag Hunt                2            2                No           No            No 
Trust Game               infinite     1                Yes          Yes           No 





We will present only the movement diagram for finding pure strategy 
solutions, and the linear programming formulation for all solutions of a zero-sum 
game. Other methods, including short-cut methods, are available to solve 
many total-conflict games. For more information on short-cut methods, see 
[Straffin1993] and the other suggested readings (pg. 336). 

For partial conflict games, we will present 


• the movement diagram for determining pure strategy solutions, if they 
  exist, 

• linear programming formulations for two-person two-strategy games for 
  equalizing strategies, 

• nonlinear programming methods for more than two strategies for each 
  player, and 

• linear programming methods to find security levels, as all players seek to 
  maximize their preferred outcome. 


We use the concept that every partial conflict game has a Nash equalizing 
mixed-strategy solution even if the game has a pure strategy solution 
([GH2009]). We conclude by briefly discussing the Nash arbitration scheme 
and its nonlinear formulation. 

Concepts and solution methodologies of N-person games, such as three- 
person total- and partial-conflict games, will be left to future studies. 








7.2 Background of Game Theory 


Game theory is the study of strategic decision making; that is, “the study 
of mathematical models of conflict and cooperation between intelligent ratio- 
nal decision-makers” ([Myerson1991]). Game theory has applications in many 
areas of business, military, government, networks, and industry. For more 
information on applications of game theory in these areas, see [CS2001], 
[Cantwell2003], [EK2010], and [Aiginger1999]. Additionally, [MF2009] dis- 
cusses game theory in warlord politics which blends military and diplomatic 
decisions. 

The study of game theory began with total conflict games, also known as 
zero-sum games, such that one person’s gain exactly equals the net losses of 
the other player(s). Game theory continues to grow with application to a wide 
range of applied problems. A “Google Scholar” search returns over 3 million 
items. 

The Nash equilibrium for a two-player, zero-sum game can be found by 
solving a linear programming problem and its dual ([Dantzig1951], 
[Dantzig2002], and [Dorfman1951]). In their work, Dantzig and Dorfman, 
respectively, assume that every element $M_{i,j}$ of the payoff matrix containing 
outcomes or payoffs to the row player is positive. More current approaches (e.g., 
[Fox2008] and [Fox2010]) show the payoff matrix entries can be positive or 
negative. 


7.2.1 Two-Person Total Conflict Games 


We begin with characteristics of the two-person total conflict game following 
[Straffin1993]: 

There are two participants: the first, Rose, is the row player, and the other, 
Colin, is the column player. 

Rose must choose from among her 1 to m strategies, and Colin must choose 
from among his 1 to n strategies. 

If Rose chooses the ith strategy and Colin the jth strategy, then Rose 
receives a payoff of $a_{ij}$ and Colin loses the amount $a_{ij}$. In Table 7.2, this is 
shown as a payoff pair where Rose receives a payoff of $M_{i,j}$ and Colin receives 
a payoff of $N_{i,j}$. 

Games are simultaneous and repetitive. 
There are two types of possible solutions. A pure strategy solution is where 
each player achieves their best outcomes by always choosing the same 
strategy in repeated games. A mixed strategy solution is where players play a 
random selection of their strategies in order to obtain the best outcomes in 
simultaneous repeated games. 

Although we do not address them in this chapter, in sequential games, the 
players look ahead and reason back. 

Table 7.2 shows a generic payoff matrix for simultaneous games. 


TABLE 7.2: General Payoff Matrix of a Two-Person Total Conflict Game 


                                      Colin’s Strategies 

                       Column 1               Column 2               ...   Column n 
             Row 1     $(M_{1,1}, N_{1,1})$   $(M_{1,2}, N_{1,2})$   ...   $(M_{1,n}, N_{1,n})$ 
Rose’s       Row 2     $(M_{2,1}, N_{2,1})$   $(M_{2,2}, N_{2,2})$   ...   $(M_{2,n}, N_{2,n})$ 
Strategies   ...       ...                    ...                    ...   ... 
             Row m     $(M_{m,1}, N_{m,1})$   $(M_{m,2}, N_{m,2})$   ...   $(M_{m,n}, N_{m,n})$ 





A game is a total conflict game if and only if the sum of the pairs $M_{i,j} + N_{i,j}$ 
always equals either 0 or the same constant $c$ for all strategies $i$ and $j$. If the 
sum equals zero, then we list only the row payoff $M_{i,j}$. 

For example, if a player wins $x$ when the other player loses $x$, then the 
sum $M_{i,j} + N_{i,j} = x - x = 0$. In a business marketing strategy, if one player 
gets $x\%$ of the market, then the other player gets $y\% = 100 - x\%$ based upon 
100% of the market. We list only $x\%$ as the outcome because when the row 
player receives $x\%$, the column player loses $x\%$. 
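
The total-conflict test is mechanical: add the two payoffs in every cell and see whether the sums agree. A minimal Python sketch follows, written as an illustration rather than in Maple; the three payoff matrices are hypothetical examples of ours, not data from the text, and the function names are our own.

```python
def classify_game(payoffs):
    """Classify a two-person game given rows of (M, N) payoff pairs."""
    sums = {m + n for row in payoffs for (m, n) in row}
    if sums == {0}:
        return "zero-sum"
    if len(sums) == 1:
        return "constant-sum"
    return "variable-sum"


def row_payoffs(payoffs):
    """For a total conflict game, keep only the row player's payoffs."""
    return [[m for (m, _n) in row] for row in payoffs]


# Hypothetical payoff matrices (integers, so the sums compare exactly).
coin_game = [[(1, -1), (-1, 1)], [(-1, 1), (1, -1)]]
market_game = [[(60, 40), (45, 55)], [(50, 50), (30, 70)]]
mixed_motive_game = [[(3, 3), (0, 5)], [(5, 0), (1, 1)]]

for name, game in [("coin", coin_game), ("market", market_game),
                   ("mixed-motive", mixed_motive_game)]:
    print(name, classify_game(game))
```

The first two games are total conflict games (zero-sum and constant-sum); the third is a partial conflict game because its cell sums vary.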


Movement Diagrams 


A movement diagram has arrows in each column (vertical arrows) and each row 
(horizontal arrows), drawn from the smaller payoff to the larger payoff for the 
player making that choice. If there exist one or more payoffs where all arrows 
point towards them, then those payoffs constitute pure strategy Nash equilibria. 


Example 7.1. Baseball Franchises. 

Several minor league baseball teams want to enter the market in a new area. 
The teams can choose to locate in a more densely populated area, or less 
densely populated town surrounded by other towns. Assume that both the 
National and American Leagues are interested in the new franchise. Suppose 
the National League will locate a franchise in either a densely populated area 
or a less densely populated area. The American League is making the same 
decision—they will locate either in a densely populated area or a less dense 
area. This situation is similar to that of the Cubs and the White Sox both being 
in Chicago, or the Yankees and Mets both being in New York City. Analysts 
have estimated the market shares. We place both sets of payoffs in a single 
game matrix. Listing the row player’s payoff, the American League’s payoff, 
as first in the ordered pair, we have the payoff matrix shown in Table 7.3. 
The payoff matrix represents a constant-sum total-conflict game. Arrows are 
added to the table to create the movement diagram. 


TABLE 7.3: New Baseball Franchise Payoff Matrix and Movement Diagram 


                                        National League Franchise 

                                        Densely          Less Densely 
                                        Populated        Populated 

American     Densely Populated          (65, 35)    ←    (70, 30) 
League                                     ↑                ↑ 
Franchise    Less Densely Populated     (55, 45)    ←    (60, 40) 








All arrows point into the payoff (65, 35), where both leagues choose the 
densely populated area, and no arrow exits that outcome. The movement diagram 
indicates that neither player can unilaterally improve their outcome, giving a 
Nash equilibrium ([Straffin1993]). 
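
The movement diagram amounts to checking every cell for outgoing arrows. The Python sketch below (an illustration of ours, not from the Maple worksheets) applies that check to the payoffs of Table 7.3; the function name is our own, and ties are treated as "no arrow out," which is an assumption the text does not address.

```python
def pure_nash_equilibria(M, N):
    """Cells (i, j) that no arrow of the movement diagram leaves.

    M[i][j] is the row player's payoff, N[i][j] the column player's payoff.
    """
    rows, cols = len(M), len(M[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            # Vertical arrows: the row player compares her payoffs in column j.
            row_stays = all(M[i][j] >= M[k][j] for k in range(rows))
            # Horizontal arrows: the column player compares his payoffs in row i.
            col_stays = all(N[i][j] >= N[i][l] for l in range(cols))
            if row_stays and col_stays:
                equilibria.append((i, j))
    return equilibria


# Table 7.3: American League is the row player, National League the column player.
M = [[65, 70], [55, 60]]
N = [[35, 30], [45, 40]]

equilibria = pure_nash_equilibria(M, N)
print(equilibria)  # [(0, 0)]
```

Index (0, 0) is the cell where both leagues choose the densely populated area, matching the (65, 35) outcome found by the diagram.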


Linear Programming in Total Conflict Games 


Von Neumann’s minimax theorem ([vNeumann1928]) states that for every 
two-person, zero-sum game with finitely many strategies, there exists an out- 
come value V and a set of strategies for each player, such that 


(a) Given Player 2’s strategy, the best payoff possible for Player 1 is V, and 
(b) Given Player 1’s strategy, the best payoff possible for Player 2 is —V. 


Equivalently, Player 1’s strategy guarantees him a payoff of at least V regardless 
of Player 2’s strategy, and similarly Player 2 can guarantee a payoff of at least 
−V regardless of Player 1’s strategy. The name minimax arises from each player 
minimizing the maximum payoff possible for the other; since the game is zero-sum, 
each player also minimizes his own maximum loss, i.e., maximizes his minimum 
payoff. 
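
When attention is restricted to pure strategies, the two guarantees can be computed directly: the row player's best worst case (maximin) and the column player's best cap on the row player's payoff (minimax). The Python sketch below is an illustration under that restriction; it uses the row player's market shares from Table 7.3 and a hypothetical coin-matching matrix of ours.

```python
def pure_security_levels(A):
    """Return (maximin, minimax) over pure strategies.

    A[i][j] is the payoff to the row player in a zero-sum (or constant-sum) game.
    """
    # Row player: choose the row whose worst payoff is largest.
    maximin = max(min(row) for row in A)
    # Column player: choose the column whose best payoff for the row player is smallest.
    minimax = min(max(A[i][j] for i in range(len(A))) for j in range(len(A[0])))
    return maximin, minimax


franchise = [[65, 70], [55, 60]]   # row player's shares from Table 7.3
coin_game = [[1, -1], [-1, 1]]     # hypothetical matrix with no saddle point

print(pure_security_levels(franchise))  # (65, 65)
print(pure_security_levels(coin_game))  # (-1, 1)
```

When the two numbers agree, as in the franchise game, the game has a saddle point and the common value is V. When they differ, the players must randomize to reach the value promised by the theorem.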

Every total conflict game may be formulated as a linear programming 
problem ([Dantzig1951] and [Dorfman1951]). Consider a total-conflict two-person 
game in which maximizing Player X has m strategies and minimizing Player 
Y has n strategies. The entry $(M_{ij}, N_{ij})$ from the ith row and jth column 
of the payoff matrix represents the payoff for those strategies. The following 
formulation, using only the elements of $M_{ij}$ for the maximizing player, 
provides results for the value of the game and the probabilities $x_i$ of playing 
each strategy ([GFH2014], [Fox2011 Maple], and [Winston2002]). 

If there are negative entries in the payoff matrix, a slight modification 
to the linear programming formulation is necessary since all variables must 
be non-negative when using the simplex method. To obtain a possible 
negative value solution for the game, use the method described in [Winston2002]: 
replace any variable that could take on negative values with the difference 
of two non-negative variables. Since only $V$, the value of the game, can be 
positive or negative, replace $V$ with $V = V_i - V_j$ with both new variables 
non-negative. The other values we are looking for are probabilities, which are 
always non-negative. In these games, players want to maximize the value of the 
game that they receive. The Linear Program (7.1) is a linear programming 
formulation for finding the optimal strategies and value of the game. 


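$$
\begin{aligned}
&\text{Maximize } V\\
&\text{subject to}\\
&\qquad M_{1,1}x_1 + M_{2,1}x_2 + \cdots + M_{m,1}x_m - V \ge 0\\
&\qquad M_{1,2}x_1 + M_{2,2}x_2 + \cdots + M_{m,2}x_m - V \ge 0\\
&\qquad\qquad \vdots\\
&\qquad M_{1,n}x_1 + M_{2,n}x_2 + \cdots + M_{m,n}x_m - V \ge 0\\
&\qquad x_1 + x_2 + \cdots + x_m = 1\\
&\text{with } V, x_i \ge 0
\end{aligned}
\tag{7.1}
$$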


The weights $x_i$ yield Rose’s strategy, and $V$ is the value of the game to Rose. When 
the solution to this total conflict game is obtained, we also have the solution to 
Colin’s game through the solution of the dual linear program ([Winston2002]). 
As an alternative to the dual, we can formulate Colin’s game directly as shown 
in (7.2) using the original $N_{i,j}$’s. We call the value of the game for Colin $v$ to 
distinguish it from Rose’s value $V$. Colin’s linear program is 


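$$
\begin{aligned}
&\text{Maximize } v\\
&\text{subject to}\\
&\qquad N_{1,1}y_1 + N_{1,2}y_2 + \cdots + N_{1,n}y_n - v \ge 0\\
&\qquad N_{2,1}y_1 + N_{2,2}y_2 + \cdots + N_{2,n}y_n - v \ge 0\\
&\qquad\qquad \vdots\\
&\qquad N_{m,1}y_1 + N_{m,2}y_2 + \cdots + N_{m,n}y_n - v \ge 0\\
&\qquad y_1 + y_2 + \cdots + y_n = 1\\
&\text{with } v, y_j \ge 0
\end{aligned}
\tag{7.2}
$$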
The weights $y_j$ yield Colin’s strategy, and $v$ is the value of the game to Colin. 
Put Example 7.1 into the two formulations and solve to obtain the solution 

$x_1 = 1$, $x_2 = 0$, yielding $V = 65$, 
$y_1 = 1$, $y_2 = 0$, yielding $v = 35$. 

The overall solution is that each league should place its new franchise in a 
densely populated area, giving a (65, 35) market share split. 
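
For a player with only two strategies, the linear program can be checked without a solver: for any weights, the largest feasible value is the smallest of the constraint expressions, so it suffices to scan the weights. The Python sketch below does this by grid search. It is a swapped-in check of ours, not the simplex method and not Maple's LPSolve, and the function name is hypothetical.

```python
def solve_two_strategy_lp(payoff, steps=1000):
    """Grid-search version of LP (7.1) for a player with two strategies.

    payoff[i][j] is the maximizing player's payoff when that player uses
    strategy i and the opponent uses strategy j.
    """
    best_weight, best_value = 0.0, float("-inf")
    for k in range(steps + 1):
        w1 = k / steps
        w2 = 1.0 - w1
        # Largest value satisfying every ">= 0" constraint for these weights.
        value = min(payoff[0][j] * w1 + payoff[1][j] * w2
                    for j in range(len(payoff[0])))
        if value > best_value:
            best_weight, best_value = w1, value
    return best_weight, best_value


# Rose (American League): the M entries of Table 7.3.
rose = solve_two_strategy_lp([[65, 70], [55, 60]])
# Colin (National League): the N entries of Table 7.3, transposed so that
# Colin's two strategies index the rows.
colin = solve_two_strategy_lp([[35, 45], [30, 40]])

print(rose, colin)  # (1.0, 65.0) (1.0, 35.0)
```

Each result is the pair (weight on the first strategy, value of the game), which agrees with the solution stated above. The grid search only scales to two strategies; larger games need a genuine LP solver.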

The primal-dual simplex method only works in the zero-sum game format 
([Fox2010]). We may convert this game to a zero-sum form to obtain the 
solution via linear programming. Since this is a constant sum game, whatever 
the American League gains, the National League loses. For example, out of 
100%, if the American League franchise gains 65%, then the National League 
franchise loses 65% of the market, as in Table 7.4. 


TABLE 7.4: Zero-Sum Game Payoff Matrix for a New Baseball Franchise 


                                        National League Franchise 

                                        Densely          Less Densely 
                                        Populated        Populated 

American     Densely Populated             65               70 
League 
Franchise    Less Densely Populated        55               60 








For a zero-sum game, we only need a single formulation of the linear 
program. The Row Player maximizes and the Column Player minimizes, and 
their two problems form a primal-dual pair. The linear program 
used in zero-sum games is equivalent to the formulation in (7.1) with $a_{i,j}$ in 
place of $M_{i,j}$ designating the zero-sum outcomes for the Row Player; the linear 
program is shown in (7.3). 


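$$
\begin{aligned}
&\text{Maximize } V\\
&\text{subject to}\\
&\qquad a_{1,1}x_1 + a_{2,1}x_2 + \cdots + a_{m,1}x_m - V \ge 0\\
&\qquad a_{1,2}x_1 + a_{2,2}x_2 + \cdots + a_{m,2}x_m - V \ge 0\\
&\qquad\qquad \vdots\\
&\qquad a_{1,n}x_1 + a_{2,n}x_2 + \cdots + a_{m,n}x_m - V \ge 0\\
&\qquad x_1 + x_2 + \cdots + x_m = 1\\
&\text{with } V, x_i \ge 0
\end{aligned}
\tag{7.3}
$$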
where $V$ is the value of the game, $a_{i,j}$ are the payoff-matrix entries, and the $x_i$’s 
are the weights (probabilities to play the strategies). 

For the baseball franchise example, place the payoffs into (7.3), letting $V$ 
be the value of the game to the Row Player, the American League, giving 

$$
\begin{aligned}
&\text{Maximize } V\\
&\text{subject to}\\
&\qquad 65x_1 + 55x_2 - V \ge 0\\
&\qquad 70x_1 + 60x_2 - V \ge 0\\
&\qquad x_1 + x_2 = 1\\
&\qquad V, x_1, x_2 \ge 0
\end{aligned}
$$





Maple solves this LP easily. 
> with(Optimization) : 
> Obj := V : 

> RowConstraints := {65*x1 + 55*x2 - V >= 0, 70*x1 + 60*x2 - V >= 0, 
                     x1 + x2 <= 1, x1 + x2 >= 1, 
                     V >= 0, x1 >= 0, x2 >= 0} : 
> RowSoln := LPSolve(Obj, RowConstraints, maximize); 
  fnormal(RowSoln[2], 4); 
        RowSoln := [65.0000000671188, [V = 65.0000000671188, 
                    x1 = 1.00000000103260, x2 = 0.]] 
        [V = 65.00, x1 = 1.000, x2 = 0.] 

We applied fnormal to the result to eliminate numerical error artifacts. 
Now for the other player. 





> Obj := v : 

> ColConstraints := {35*y1 + 30*y2 - v >= 0, 45*y1 + 40*y2 - v >= 0, 
                     y1 + y2 <= 1, y1 + y2 >= 1, 
                     v >= 0, y1 >= 0, y2 >= 0} : 
> ColSoln := LPSolve(Obj, ColConstraints, maximize); 
  fnormal(ColSoln[2], 4); 
        ColSoln := [35.0000000361409, [v = 35.0000000361409, 
                    y1 = 1.00000000103260, y2 = 0.]] 
        [v = 35.00, y1 = 1.000, y2 = 0.] 


Note that the solutions are not exact; this is due to the numerical methods 
Maple uses internally, and is why we applied fnormal to the result. 

The optimal solution strategies found are identical to those found before, with 
both players choosing a more densely populated area as their best strategy. The use 
of linear programming is quite efficacious for large games between two players 
each having many strategies ([Fox2010], [Fox2014], and [GFH2014]). We note 
that the solution to the National League franchise game is found either as the 
dual solution (see [Winston2002], Section 11.3) or by simply re-solving the 
linear program from the Column Player’s perspective. In Chapter 4 of Advanced 
Problem Solving with Maple™: A First Course, we showed how Maple can be 
used to solve both the primal and dual linear programs. 


The Partial Conflict Game 


Partial conflict games are games in which one player wins, but the other player 
does not have to lose. Both players could win something or both could lose 
something. Solution methods for partial conflict games include looking for 
dominance, analyzing movement diagrams, and finding equalizing strategies. 
Here we present an extension from the total conflict game to the partial conflict 
game as an application of linear programming; see [Fox2010] and [Fox2014]. 
Because both players in a partial conflict game are trying to maximize their 
own outcomes, we model each player’s strategy choice as a separate 
maximizing linear program. 

Again, use the payoff matrix of Table 7.2. Now assume that $M_{i,j} + N_{i,j}$ 
is not always equal to zero or the same constant for all $i$ and $j$. In 
non-cooperative partial-conflict games, we first look for a pure-strategy solution 
using a movement diagram. 

The Row Player, Rose, maximizes payoffs, so she would prefer the highest 
payoff in each column. Draw a vertical arrow in each column pointing to 
Rose’s highest payoff in that column. Similarly, the Column Player, Colin, 
maximizes his payoffs, so he would prefer the highest payoff in each row. Draw 
a horizontal arrow in each row pointing to Colin’s highest payoff in that row. 
If all arrows point to a cell from every direction, then that cell is a pure 
Nash equilibrium. 

If no cell has every arrow pointing into it, i.e., there is no pure Nash 
equilibrium, then we must use equalizing strategies to find the weights 
(probabilities) for each player. For a game with two players having two strategies 
each, proceed as follows: 


Rose’s game: Rose maximizing and Colin “equalizing” is a total-conflict 
game that yields Colin’s equalizing strategy. 


Colin’s game: Colin maximizing and Rose “equalizing” is a total-conflict 
game that yields Rose’s equalizing strategy. 


Note: If either side plays its equalizing strategy, the other side cannot unilat- 
erally improve its own situation—the other player is stymied. 


This analysis translates into two maximizing linear programming formulations 
shown in (7.4) and (7.5) below. The LP formulation in (7.4) provides the 
Nash equalizing solution for Colin with strategies played by Rose, while the 
LP formulation in (7.5) provides the Nash equalizing solution for Rose with 
strategies played by Colin. 








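\[
\begin{aligned}
\text{Maximize } \; & V \\
\text{subject to } \; & N_{1,1}x_1 + N_{2,1}x_2 - V \ge 0 \\
& N_{1,2}x_1 + N_{2,2}x_2 - V \ge 0 \\
& V,\, x_1,\, x_2 \ge 0
\end{aligned}
\tag{7.4}
\]

\[
\begin{aligned}
\text{Maximize } \; & v \\
\text{subject to } \; & M_{1,1}y_1 + M_{1,2}y_2 - v \ge 0 \\
& M_{2,1}y_1 + M_{2,2}y_2 - v \ge 0 \\
& v,\, y_1,\, y_2 \ge 0
\end{aligned}
\tag{7.5}
\]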


If there is a pure strategy solution, it is found through movement diagrams 
or dominance. The linear programming formulations will not find pure strategy 
results; LPs only provide the Nash equilibrium using equalizing strategies 
([Straffin1993]). 

For games with two players and more than two strategies each, Bazarra et 
al. (see [BSC2013]) presented a nonlinear optimization approach. Consider a 
two-person game with a standard payoff matrix. Separate the payoff matrix 
into two matrices M and N for Players I and II. Then solve the nonlinear 
optimization formulation given in expanded form in (7.6). 


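\[
\begin{aligned}
\text{Maximize } \; & \sum_{i=1}^{n}\sum_{j=1}^{m} x_i a_{ij} y_j + \sum_{i=1}^{n}\sum_{j=1}^{m} x_i b_{ij} y_j - p - q \\
\text{subject to } \; & \sum_{j=1}^{m} a_{ij} y_j \le p, \quad i = 1, 2, \ldots, n, \\
& \sum_{i=1}^{n} x_i b_{ij} \le q, \quad j = 1, 2, \ldots, m, \\
& \sum_{i=1}^{n} x_i = \sum_{j=1}^{m} y_j = 1
\end{aligned}
\tag{7.6}
\]

Here $a_{ij}$ and $b_{ij}$ denote the entries of $M$ and $N$, respectively. 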


Example 7.2. A Partial Conflict Equalizing Strategy Game Solution. 
Table 7.5 shows a partial-conflict game with revised payoff estimates of market 
shares. The arrows in the movement diagram indicate that the game has no 
pure strategy solution. 
TABLE 7.5: Partial-Conflict Movement Diagram 

                                 Colin
                      Large City       Small City
Rose   Large City  |   (20,40)    ←     (10,0)
                   |      ↓                ↑
       Small City  |   (30,10)    →     (0,40)

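The movement-diagram test can also be sketched in Maple. The block below is a minimal illustration, not the book's method: `PureNash` is a hypothetical helper name, and the two matrices are simply Rose's and Colin's payoffs read from Table 7.5. A cell is reported when it is a best response for both players.

```maple
# Pure-strategy Nash equilibria of a two-player game by best-response checks.
# PureNash is a hypothetical helper name, not a Maple library routine.
M := Matrix([[20, 10], [30, 0]]):    # Rose's payoffs from Table 7.5
N := Matrix([[40, 0], [10, 40]]):    # Colin's payoffs from Table 7.5

PureNash := proc(M::Matrix, N::Matrix)
    local m, n, i, j, k, eq;
    m, n := LinearAlgebra:-Dimension(M);
    eq := [];
    for i to m do
        for j to n do
            # Rose cannot gain by changing rows within column j, and
            # Colin cannot gain by changing columns within row i.
            if M[i, j] = max(seq(M[k, j], k = 1 .. m)) and
               N[i, j] = max(seq(N[i, k], k = 1 .. n)) then
                eq := [op(eq), [i, j]];
            end if;
        end do;
    end do;
    return eq;
end proc:

PureNash(M, N);
```

For Table 7.5 the call returns an empty list, which agrees with the arrows: no cell is a best response for both players at once.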
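We use (7.4) and (7.5) to formulate and solve these partial conflict games 
for their Nash equalizing strategies. 

\[
\begin{aligned}
\text{Maximize } \; & V_c \\
\text{subject to } \; & 40x_1 + 10x_2 - V_c \ge 0 \\
& 0x_1 + 40x_2 - V_c \ge 0 \\
& x_1 + x_2 = 1 \\
& V_c,\, x_1,\, x_2 \ge 0
\end{aligned}
\tag{7.7}
\]

\[
\begin{aligned}
\text{Maximize } \; & V_r \\
\text{subject to } \; & 20y_1 + 10y_2 - V_r \ge 0 \\
& 30y_1 + 0y_2 - V_r \ge 0 \\
& y_1 + y_2 = 1 \\
& V_r,\, y_1,\, y_2 \ge 0
\end{aligned}
\tag{7.8}
\]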





The solutions to this partial conflict game are 
(a) $V_c = 22.857$ when $x_1 = 0.429$ and $x_2 = 0.571$, and 
(b) $V_r = 15.000$ when $y_1 = 0.500$ and $y_2 = 0.500$. 


This game results in Colin playing Large City and Small City each half the 
time, ensuring a value of 15.00 for Rose. Rose plays a mixed strategy of 3/7 
Large City and 4/7 Small City, which yields a value of the game of 22.857 for 
Colin. 
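The equalizing strategies can be checked exactly with Maple's `solve`. This is a sketch under one stated assumption: both constraints of each linear program are treated as equalities, which is what "equalizing" requires. The names `RoseMix` and `ColinMix` are illustrative only.

```maple
# Equalizing strategies for Table 7.5, with both constraints of each
# linear program held at equality (an assumption of this sketch).
RoseMix := solve({40*x1 + 10*x2 = Vc, 0*x1 + 40*x2 = Vc, x1 + x2 = 1},
                 {x1, x2, Vc});
ColinMix := solve({20*y1 + 10*y2 = Vr, 30*y1 + 0*y2 = Vr, y1 + y2 = 1},
                  {y1, y2, Vr});

# Decimal versions for comparison with the values quoted in the text.
evalf[5](RoseMix);
evalf[5](ColinMix);
```

The first system gives $V_c = 160/7 \approx 22.857$, and the second gives $y_1 = y_2 = 1/2$ with $V_r = 15$. The equalities matter here: in (7.8) both left-hand sides increase with $y_1$, so the inequalities alone would not pin down the equalizing mix.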

To be a solution, the Nash equilibrium must be Pareto optimal (no other 
solution is better for both players, northeast region) as defined by Straffin 
(see [Straffin1993]). Visually, we can plot the coordinates of the outcomes 
and connect them into a convex set (the convex hull). The Nash equilibrium 
(15, 22.86) is an interior point, and so it is not Pareto optimal; see Figure 7.1. 

[Figure: payoff polygon of the four outcomes, with the labels "Pareto Optimal" and "Nash Equilibrium"] 

FIGURE 7.1: Payoff Polygon with Nash Equilibrium 


When the results are mixed strategies, the implication is that the game is 
repeated many times in order to achieve that outcome in the long run. 


Example 7.3. A 3 x 3 Nonzero Sum Game. 
Rose and Colin each have three strategies with payoffs shown in Table 7.6. 


TABLE 7.6: Rose and Colin Strategies 

                       Colin
               C1        C2        C3
       R1 |  (-1,1)    (0,2)     (0,2)
Rose   R2 |  (2,1)     (1,-1)    (0,0)
       R3 |  (0,0)     (1,1)     (1,2)





First, we use a movement diagram to find two Nash equilibrium points. 
They are $R_2C_1 = (2,1)$ and $R_3C_3 = (1,2)$. These pure strategy solutions are 
not equivalent, and trying to achieve them might lead to other results. We 
might employ the nonlinear method described earlier to look for other 
equilibrium solutions, if they exist. We find the nonlinear method does produce 
another solution with $p = q = 0.667$ when $x_1 = 0$, $x_2 = 0.667$, $x_3 = 0.333$, 
$y_1 = 0.333$, $y_2 = 0$, and $y_3 = 0.667$. The Maple statements to obtain this 
solution are: 


> with(Optimization):

> M := Matrix([[-1, 0, 0], [2, 1, 0], [0, 1, 1]]):
  N := Matrix([[1, 2, 2], [1, -1, 0], [0, 1, 2]]):

> X := Vector[row](3, symbol = x):
  Y := Vector(3, symbol = y):

> Objective := expand(X.M.Y + X.N.Y - p - q):

> Constraints := {seq((M.Y)[i] <= p, i = 1..3),
                  seq((X.N)[i] <= q, i = 1..3),
                  add(x[i], i = 1..3) = 1, add(y[i], i = 1..3) = 1}:


Since the objective function is quadratic, use Maple’s quadratic program 
solver, QPSolve, from the Optimization package. 





> Soln := QPSolve(Objective, Constraints, assume = nonnegative, maximize):

> fnormal(Soln, 3);

    [0., [p = 0.667, q = 0.667, x[1] = 0., x[2] = 0.667, x[3] = 0.333,
          y[1] = 0.333, y[2] = 0., y[3] = 0.667]]
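A quick way to see why these weights form an equilibrium is to compute each player's expected payoffs against the other's mix. The sketch below assumes the exact fractions 1/3 and 2/3 behind the rounded values 0.333 and 0.667; the names `RosePayoffs` and `ColinPayoffs` are illustrative.

```maple
# Check the mixed strategies reported for Example 7.3 against Table 7.6.
M := Matrix([[-1, 0, 0], [2, 1, 0], [0, 1, 1]]):   # Rose's payoffs
N := Matrix([[1, 2, 2], [1, -1, 0], [0, 1, 2]]):   # Colin's payoffs
X := Vector[row]([0, 2/3, 1/3]):                   # Rose's weights (assumed exact)
Y := Vector([1/3, 0, 2/3]):                        # Colin's weights (assumed exact)

RosePayoffs := M . Y;     # expected payoff of each of Rose's rows
ColinPayoffs := X . N;    # expected payoff of each of Colin's columns
```

The rows give Rose $-1/3$, $2/3$, $2/3$ and the columns give Colin $2/3$, $-1/3$, $2/3$. Each player is therefore indifferent among the strategies actually played, and the common value $2/3$ matches $p = q = 0.667$.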





Communications and Cooperation in Partial Conflict Games 


Allowing for communication may change the game’s solution. A player may 
consider combinations of moves, threats, and promises to attempt to obtain 
better outcomes. The strategy of moves is explored in several sources listed in 
References and Further Reading (pg. 336). 


Nash Arbitration Method 


When we have not achieved a solution by other methods that is acceptable 
to the players, then a game may move to arbitration. The Nash Arbitration 
Theorem (Nash, 1950) states that there is a unique arbitration solution which 
satisfies the following axioms: 

Rationality: The solution point is feasible. 

Linear Invariance: Changing scale does not change the solution. 

Symmetry: The solution does not discriminate against any player. 

Independence of Irrelevant Alternatives: Eliminating solutions that 
would not be chosen does not change the solution. 


Prudential Strategy (Security Levels) 


A player’s security level is the payoff that player can guarantee in a partial 
conflict game by maximizing their own payoff regardless of what the other player 
does. We can solve for these payoffs using a separate linear program for each 
security level. The LP formulations are (7.9) and (7.10). 


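\[
\begin{aligned}
\text{Maximize } \; & V \\
\text{subject to } \; & M_{1,1}x_1 + M_{2,1}x_2 + \cdots + M_{m,1}x_m - V \ge 0 \\
& M_{1,2}x_1 + M_{2,2}x_2 + \cdots + M_{m,2}x_m - V \ge 0 \\
& \qquad \vdots \\
& M_{1,n}x_1 + M_{2,n}x_2 + \cdots + M_{m,n}x_m - V \ge 0 \\
& x_1 + x_2 + \cdots + x_m = 1 \\
& x_i \le 1 \text{ for } i = 1, 2, \ldots, m \\
\text{with } \; & V,\, x_i \ge 0
\end{aligned}
\tag{7.9}
\]

The weights $x_i$ yield Rose’s prudential strategy for the security level $V$. 

\[
\begin{aligned}
\text{Maximize } \; & v \\
\text{subject to } \; & N_{1,1}y_1 + N_{1,2}y_2 + \cdots + N_{1,n}y_n - v \ge 0 \\
& N_{2,1}y_1 + N_{2,2}y_2 + \cdots + N_{2,n}y_n - v \ge 0 \\
& \qquad \vdots \\
& N_{m,1}y_1 + N_{m,2}y_2 + \cdots + N_{m,n}y_n - v \ge 0 \\
& y_1 + y_2 + \cdots + y_n = 1 \\
& y_i \le 1 \text{ for } i = 1, 2, \ldots, n \\
\text{with } \; & v,\, y_i \ge 0
\end{aligned}
\tag{7.10}
\]

The weights $y_i$ yield Colin’s prudential strategy for the security level $v$. 
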
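Revisit Example 7.2 to illustrate finding security levels. Let SLR and SLC 
represent the security levels for Rose and Colin, respectively. We use linear 
programming to find these values using (7.9) and (7.10), yielding 

\[
\begin{aligned}
\text{Maximize } \; & SLR \\
\text{subject to } \; & 20x_1 + 30x_2 - SLR \ge 0 \\
& 10x_1 + 0x_2 - SLR \ge 0 \\
& x_1 + x_2 = 1 \\
& x_1,\, x_2 \le 1 \\
& SLR,\, x_1,\, x_2 \ge 0
\end{aligned}
\qquad\qquad
\begin{aligned}
\text{Maximize } \; & SLC \\
\text{subject to } \; & 40y_1 + 0y_2 - SLC \ge 0 \\
& 10y_1 + 40y_2 - SLC \ge 0 \\
& y_1 + y_2 = 1 \\
& y_1,\, y_2 \le 1 \\
& SLC,\, y_1,\, y_2 \ge 0
\end{aligned}
\]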


The solution yields both how the game is played and the security levels. 
Rose always plays $R_1$, and Colin plays 4/7 $C_1$ and 3/7 $C_2$. The security 
levels are (SLR, SLC) = (10, 22.86). 
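The two security-level programs can be entered in Maple in the same style as the other linear programs in this chapter. This is a sketch only: the names `RoseCons`, `ColinCons`, `RoseSL`, and `ColinSL` are illustrative, the variables are written `x1`, `x2`, `y1`, `y2`, and the bounds $x_i \le 1$, $y_i \le 1$ are left out because they are implied by nonnegativity and the weights summing to one.

```maple
with(Optimization):

# Rose's security level from (7.9), using her payoffs in Table 7.5.
RoseCons := {20*x1 + 30*x2 - SLR >= 0, 10*x1 + 0*x2 - SLR >= 0,
             x1 + x2 = 1}:
RoseSL := LPSolve(SLR, RoseCons, assume = nonnegative, maximize);

# Colin's security level from (7.10), using his payoffs in Table 7.5.
ColinCons := {40*y1 + 0*y2 - SLC >= 0, 10*y1 + 40*y2 - SLC >= 0,
              y1 + y2 = 1}:
ColinSL := LPSolve(SLC, ColinCons, assume = nonnegative, maximize);
```

If the formulation is entered as above, the results should agree with the security levels (10, 22.86) reported in the text; wrapping each call in `fnormal(..., 3)` removes floating point artifacts.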

Using this security level, (10, 22.86), as our status quo point, we can 
formulate the Nash arbitration scheme. We restate more formally the four axioms 
stated above that are met using the Nash arbitration scheme. 


Axiom 1: Rationality. The solution must be in the negotiation set. 


Axiom 2: Linear Invariance. If either Player 1’s or Player 2’s utility functions 
are transformed by a positive linear function, the solution point should 
be transformed by the same function. 


Axiom 3: Symmetry. If the polygon happens to be symmetric about the line 
of slope +1 through the status quo point, then the solution should be 
on this line. No player is favored, no player is discriminated against. 


Axiom 4: Independence of Irrelevant Alternatives. Suppose N is the solution 
point for a polygon P with status quo point SQP. Suppose Q is another 
polygon which contains both SQP and N, and is totally contained in 
P. Then N should also be the solution point to Q with status quo point 
SQP, i.e., the solution point is not changed by non-solution points being 
eliminated from consideration. 


Theorem. Nash’s Arbitration Theorem (Nash,² 1950). 

There is one and only one arbitration scheme which satisfies the four axioms of 
rationality, linear invariance, symmetry, and independence of irrelevant 
alternatives. The arbitration scheme is: If the status quo point (SQP) is 
$(x_0, y_0)$, then the arbitrated solution point $N$ is the point $(x, y)$ in the 
polygon with $x \ge x_0$, $y \ge y_0$ which maximizes the product 
$Z = (x - x_0)(y - y_0)$. 


We apply Nash’s theorem in a nonlinear optimization framework (Kuhn-Tucker 
conditions). The formulation for our example is 

\[
\begin{aligned}
\text{Maximize } \; & Z = (x - 10)(y - 22.86) \\
\text{subject to } \; & 3x + y = 100 \\
& x \ge 10 \\
& y \ge 22.86 \\
& x,\, y \ge 0
\end{aligned}
\]

Maple finds the solution easily using QPSolve. In this example, for the 
status quo point (10, 22.86), QPSolve gives our Nash arbitration point as 
(17.86, 46.43). 
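The arbitration point can also be checked by hand. The sketch below assumes only the equality constraint $3x + y = 100$ from the formulation above and substitutes it into $Z$; the bounds $x \ge 10$ and $y \ge 22.86$ turn out not to bind at the resulting point.

```latex
\[
\begin{aligned}
Z(x) &= (x - 10)\bigl((100 - 3x) - 22.86\bigr) = (x - 10)(77.14 - 3x),\\
Z'(x) &= (77.14 - 3x) - 3(x - 10) = 107.14 - 6x = 0
        \;\Longrightarrow\; x = \frac{107.14}{6} \approx 17.86,\\
y &= 100 - 3(17.86) \approx 46.43 .
\end{aligned}
\]
```

Since $Z''(x) = -6 < 0$, this stationary point is a maximum along the line, which reproduces the point (17.86, 46.43) quoted above.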





² John F. Nash, Jr., “The Bargaining Problem,” Econometrica, 18(2), 1950, pg. 155-162. 

7.3 Examples of Zero-Sum Games 


In this section, we present several illustrative examples of the theory of 
total-conflict games. We present the scenario, discuss the outcomes used in the 
payoff matrix, and present a possible solution for the game. In most game 
theory problems, the solution suggests insights into how to play the game rather 
than a definite methodology for “winning” the game. 


Example 7.4. The Battle of the Bismarck Sea. 

The Battle of the Bismarck Sea is set in the South Pacific in 1943. See 
Figure 7.2. The historical facts are that General Imamura had been ordered 
to transport Japanese troops to New Guinea. General Kenney, the United 
States commander in the region, wanted to bomb the troop transports prior 
to their arrival at their destination. Imamura had two options to choose from 
as routes to New Guinea: a shorter northern route or a longer southern route. 
Kenney had to decide where to send his search planes and bombers to find the 
Japanese fleet. If Kenney sent his planes to the wrong route, he could recall 
them, but the number of bombing days would be reduced. 

FIGURE 7.2: The Battle of the Bismarck Sea. Japanese troops were being 
taken from Rabaul to Lae. (Map Source: The World Fact Book, CIA) 

We assume that both commanders, Imamura and Kenney, are rational 
players, each trying to obtain his best outcome. Further, we assume that 
there is no communication or cooperation, which may be inferred since the 
two are enemies engaged in war. Each is aware of the intelligence 
assets available to both sides and of what those assets are producing. 
We assume that the estimates of the number of days that 
US planes can bomb, as well as the number of days to sail to New Guinea, are 
accurate. 


The players, Kenney and Imamura, both have the same set of strategies 
for routes: {North, South}, and their payoffs, given as the numbers of exposed 
days for bombing, are shown in Table 7.7. Imamura loses exactly what Kenney 
gains. 


TABLE 7.7: The Battle of the Bismarck Sea with Payoffs (Kenney, Imamura) 

                       Imamura
                   North      South
Kenney   North |  (2,-2)     (2,-2)
         South |  (1,-1)     (3,-3)








Graphing the payoffs in Figure 7.3 shows that this is a total conflict game. 

FIGURE 7.3: Graph of Payoffs for the Battle of the Bismarck Sea 

Because this is a total conflict game, Table 7.8 only needs to list the outcomes 
to Kenney in order to find a solution. 


TABLE 7.8: The Battle of the Bismarck Sea as a Zero Sum Game 

                     Imamura
                  North    South
Kenney   North |    2        2
         South |    1        3





There is a dominant column strategy for Imamura: to sail North, since 
the values in that column are correspondingly less than or equal to the values 
for sailing South. The dominant North column would eliminate the South 
column. Seeing that as an option, Kenney would search North, since that option 
provides a greater outcome than searching South (2 > 1). We could also 
apply the minimax theorem (saddle point method) to find a plausible Nash 
equilibrium as Kenney searches North and Imamura takes the Northern route. 
See Table 7.9. 


TABLE 7.9: Minimax Method (Saddle Point Method) 

                       Imamura
                    North   South | Row Min | Max of Min
Kenney   North        2       2   |    2    |     2
         South        1       3   |    1    |
Column Max            2       3   |
Min of Max            2           |

















Applied to the Battle of the Bismarck Sea, the Nash equilibrium (North, 
North) implies that neither player can do better by unilaterally changing 
strategy. The solution is for the Japanese to sail North and for Kenney to 
search North, yielding 2 bombing days. This result, (North, North), was indeed 
the real outcome in 1943. 
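The saddle point test of Table 7.9 takes only a few Maple statements. The following is a minimal sketch; the names `RowMin`, `ColMax`, `MaxOfMin`, and `MinOfMax` are illustrative and mirror the table's labels.

```maple
# Saddle-point (minimax) check for a zero-sum game, given the row player's payoffs.
A := Matrix([[2, 2], [1, 3]]):          # Table 7.8: bombing days for Kenney
m, n := LinearAlgebra:-Dimension(A):

RowMin := [seq(min(seq(A[i, j], j = 1 .. n)), i = 1 .. m)];
ColMax := [seq(max(seq(A[i, j], i = 1 .. m)), j = 1 .. n)];

MaxOfMin := max(op(RowMin));
MinOfMax := min(op(ColMax));

# A pure strategy solution exists exactly when the two values agree.
evalb(MaxOfMin = MinOfMax);
```

Here the row minimums are [2, 1] and the column maximums are [2, 3], so both values equal 2 and the test returns true, confirming the saddle point at (North, North).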

Next, let’s assume that communication is allowed. We will consider first 
moves by each player. If Kenney moved first, (North, North) would remain 
the outcome. However, (North, South) also becomes a valid response with the 
same value of 2. 

If Imamura moved first, (North, North) would be the outcome. What is 
important about moving first in a zero sum game is that, although it gives 
more information, neither player can do better than the Nash equilibrium 
from the original zero sum game. We conclude from our brief analysis that 
moving first does not alter the equilibrium of this game; more generally, moving 
first in a zero sum game does not alter the equilibrium strategies. 

A Maple solution follows. Remember to begin by entering with(Optimization). 
First, the solution for Kenney: 

> KenneyObj := V;

> KenneyCons := {2*x[1] + x[2] - V >= 0, 2*x[1] + 3*x[2] - V >= 0,
                 x[1] + x[2] >= 1, x[1] + x[2] <= 1,
                 V >= 0, x[1] >= 0, x[2] >= 0}:

> fnormal(LPSolve(KenneyObj, KenneyCons, maximize), 3);

    [2.000, [V = 2.000, x[1] = 1.000, x[2] = 0.]]


We used fnormal to eliminate artifacts from floating point computations. 
Now, the solution for the Japanese commander, Imamura. 








> ImamuraObj := v;

> ImamuraCons := {2*y[1] + 2*y[2] - v >= 0, 1*y[1] + 3*y[2] - v >= 0,
                  y[1] + y[2] >= 1, y[1] + y[2] <= 1,
                  v >= 0, y[1] >= 0, y[2] >= 0}:

> fnormal(LPSolve(ImamuraObj, ImamuraCons, maximize), 3);

    [2.000, [v = 2.000, y[1] = -0., y[2] = 1.000]]





Example 7.5. Penalty Kicks in Soccer.³ 


A penalty kick in soccer is a game between a kicker and the opposing goalie. 
The kicker has two alternative strategies: he might kick left or kick right. The 
goalie will also have two strategies: the goalie can dive left or right to block the 
kick. We will start with a very simple payoff matrix with a 1 for the player 
that is successful and a —1 for the player that is unsuccessful, assuming a 
correct dive blocks the kick. The payoff matrix is in Table 7.10. 


TABLE 7.10: Penalty Kick Payoffs 

                           Goalie
                    Dive Left   Dive Right
Kicker  Kick Left  |  (-1,1)      (1,-1)
        Kick Right |  (1,-1)      (-1,1)

³ This example is adapted from Chiappori, Levitt, and Groseclose [CLG2002]. 

Or, from just the kicker’s perspective, see Table 7.11. 


TABLE 7.11: Kicker’s Penalty Kick Payoffs 

                           Goalie
                    Dive Left   Dive Right
Kicker  Kick Left  |    -1           1
        Kick Right |     1          -1





There is no pure strategy solution. We find a mixed strategy solution to the 
zero-sum game using either linear programming or the method of oddments.⁴ The 
mixed strategy results are that the kicker randomly kicks 50% left and 50% 
right, while the goalie randomly dives 50% left and 50% right. The value of 
the game to each player is 0. 

Let’s refine the game using real data. A study was done in the Italian 
Football League in 2002 by Ignacio Palacios-Huerta.⁵ As he observed, the 
kicker can aim the ball to the left or to the right of the goalie, and the goalie 
can dive either left or right as well. The ball is kicked with enough speed 
that the decisions of the kicker and goalie are effectively made simultaneously. 
Based on these decisions, the kicker is likely to score or not score. The structure 
of the game is remarkably similar to our simplified game. If the goalie dives 
in the direction that the ball is kicked, then he has a good chance of stopping 
the goal; if he dives in the wrong direction, then the kicker is likely to score a 
goal. 

After analyzing approximately 1400 penalty kicks, Palacios-Huerta deter- 
mined the empirical probabilities of scoring for each of four outcomes: the 
kicker kicks left or right, and the goalie dives left or right. His results led to 
the payoff matrix in Table 7.12. 


TABLE 7.12: Penalty Kick Probabilities of Scoring 

                              Goalie
                     Dive Left        Dive Right
Kicker  Kick Left  | (0.58,-0.58)    (0.95,-0.95)
        Kick Right | (0.93,-0.93)    (0.70,-0.70)











⁴ See, e.g., [Straffin1993]. 
⁵ Palacios-Huerta, “Professionals Play Minimax,” Review of Economic Studies (2003) 70, 395-415. 

Applying our solution method to the linear programming formulation finds 
the optimal solution, whether it is a pure strategy or a mixed strategy. 


> KickerObj := K;

> KickerCons := {0.58*x[1] + 0.93*x[2] - K >= 0, 0.95*x[1] + 0.70*x[2] - K >= 0,
                 x[1] + x[2] >= 1, x[1] + x[2] <= 1,
                 K >= 0, x[1] >= 0, x[2] >= 0}:

> fnormal(LPSolve(KickerObj, KickerCons, maximize), 4);

    [0.7958, [K = 0.7958, x[1] = 0.3833, x[2] = 0.6167]]

> GoalieObj := G;

> GoalieCons := {0.42*y[1] + 0.05*y[2] - G >= 0, 0.07*y[1] + 0.30*y[2] - G >= 0,
                 y[1] + y[2] >= 1, y[1] + y[2] <= 1,
                 G >= 0, y[1] >= 0, y[2] >= 0}:

> fnormal(LPSolve(GoalieObj, GoalieCons, maximize), 4);

    [0.2042, [G = 0.2042, y[1] = 0.4167, y[2] = 0.5833]]





A short-cut method, the Method of Oddments, is shown in Table 7.13. 


TABLE 7.13: Method of Oddments 

                             Goalie
                     Dive Left    Dive Right | Oddments | Probabilities
Kicker  Kick Left       0.58         0.95    |   0.37   | 0.23/0.60 = 0.383
        Kick Right      0.93         0.70    |   0.23   | 0.37/0.60 = 0.6166
Oddments                0.35         0.25    |
Probabilities        0.25/0.60    0.35/0.60  |
                      = 0.416      = 0.5833  |





We find the mixed strategy for the kicker is 38.3% kicking left and 61.7% 
kicking right, while the goalie dives right 58.3% and dives left 41.7%. If we 
merely count percentages from the data that were collected by Palacios-Huerta 
in his study of 459 penalty kicks over 5 years, we find the kicker kicked left 
40% and right 60% of the time, while the goalie dove left 42% and 
right 58%. Since our model closely approximates the data, our game theory 
approach adequately models the penalty kick. 
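The oddments shortcut is easy to script. The procedure below is a minimal sketch for a 2 x 2 zero-sum game with no saddle point; `Oddments` is a hypothetical helper name, and its arguments a, b, c, d are the row player's payoffs read across the two rows.

```maple
# Method of oddments for a 2 x 2 zero-sum game [[a, b], [c, d]] with no saddle point.
# Oddments is a hypothetical helper name, not a Maple library routine.
Oddments := proc(a, b, c, d)
    local rowOdd, colOdd, rowSum, colSum;
    # Each row's weight comes from the other row's difference, and
    # each column's weight comes from the other column's difference.
    rowOdd := [abs(c - d), abs(a - b)];
    colOdd := [abs(b - d), abs(a - c)];
    rowSum := rowOdd[1] + rowOdd[2];
    colSum := colOdd[1] + colOdd[2];
    return [map(t -> t / rowSum, rowOdd), map(t -> t / colSum, colOdd)];
end proc:

# Table 7.12 scoring probabilities: rows Kick Left, Kick Right.
Oddments(0.58, 0.95, 0.93, 0.70);
```

For the Table 7.12 entries the call gives weights of about 0.383 and 0.617 for the kicker and about 0.417 and 0.583 for the goalie, in line with Table 7.13 and with the LPSolve results.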

The next example, a batter-pitcher duel, continues the theme of technology 
in sports today. 
Example 7.6. Batter-Pitcher Duel. 
We extend to four strategies for each player. Consider a batter-pitcher duel 
between Aaron Judge of the New York Yankees, and various pitchers in the 
American League where the pitcher throws a fastball, a split-finger fastball, 
a curve ball, and a change-up. The batter, aware of these pitches, must pre- 
pare appropriately for the pitch. We’ll consider right- and left-handed pitch- 
ers separately in this analysis. Data is available from many websites, such as 
www.STATS.com. 

The data in Table 7.14 has been compiled for an American League right- 
handed pitcher (RHP) versus Aaron Judge. Let FB = fastball, CB = 
curve ball, CH = change-up, SF = split-finger fastball. 


TABLE 7.14: Aaron Judge vs. a Right-Handed Pitcher 





Judge / RHP | FB CB CH SF 
FB 0.337 0.246 0.220 0.200 
CB 0.283 0.571 0.339 0.303 
CH 0.188 0.347 0.714 0.227 
SF 0.200 0.227 0.154 0.500 





Both the batter and pitcher want the best possible result. We set this up as 
a linear programming problem. Our decision variables are $x_1$, $x_2$, $x_3$, and $x_4$ 
as the percentages to guess FB, CB, CH, SF, respectively, and $V$ represents 
Judge’s batting average. 

\[
\begin{aligned}
\text{Maximize } \; & V \\
\text{subject to } \; & 0.337x_1 + 0.283x_2 + 0.188x_3 + 0.200x_4 - V \ge 0 \\
& 0.246x_1 + 0.571x_2 + 0.347x_3 + 0.227x_4 - V \ge 0 \\
& 0.220x_1 + 0.339x_2 + 0.714x_3 + 0.154x_4 - V \ge 0 \\
& 0.200x_1 + 0.303x_2 + 0.227x_3 + 0.500x_4 - V \ge 0 \\
& x_1 + x_2 + x_3 + x_4 = 1 \\
& x_1,\, x_2,\, x_3,\, x_4,\, V \ge 0
\end{aligned}
\]


We solve this linear programming problem with Maple, and find the optimal 
solution (strategy) is to guess the fastball (FB) 27.49%, guess the curve ball 
(CB) 64.23%, never guess change-up (CH), and guess split-finger fastball 
(SF) 8.27% of the time to obtain a 0.291 batting average. 
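The text does not list the Maple statements for this example, so the following is only a sketch of how the batter's program might be entered, following the pattern of the earlier examples. The names `A`, `X`, and `JudgeCons` are illustrative; the matrix is Table 7.14 with rows as Judge's guesses and columns as the pitches thrown.

```maple
with(Optimization):

# Table 7.14: rows are Judge's guesses (FB, CB, CH, SF),
# columns are the pitches thrown (FB, CB, CH, SF).
A := Matrix([[0.337, 0.246, 0.220, 0.200],
             [0.283, 0.571, 0.339, 0.303],
             [0.188, 0.347, 0.714, 0.227],
             [0.200, 0.227, 0.154, 0.500]]):
X := Vector[row](4, symbol = x):

# One constraint per pitch: Judge's expected average is at least V.
JudgeCons := {seq((X . A)[j] - V >= 0, j = 1 .. 4),
              add(x[i], i = 1 .. 4) = 1}:
fnormal(LPSolve(V, JudgeCons, assume = nonnegative, maximize), 4);
```

The guessing percentages quoted above satisfy the first, third, and fourth constraints with equality at about 0.291, so a run of this sketch should land on the same batting average.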

The pitcher, in turn, wants to keep the batting average as low as possible. Set 
up the linear program for the pitcher as follows. The decision variables are 
$y_1$, $y_2$, $y_3$, and $y_4$ as the percentages to throw FB, CB, CH, SF, respectively, 
and $V$ again represents Judge’s batting average. 

\[
\begin{aligned}
\text{Minimize } \; & V \\
\text{subject to } \; & 0.337y_1 + 0.246y_2 + 0.220y_3 + 0.200y_4 - V \le 0 \\
& 0.283y_1 + 0.571y_2 + 0.339y_3 + 0.303y_4 - V \le 0 \\
& 0.188y_1 + 0.347y_2 + 0.714y_3 + 0.227y_4 - V \le 0 \\
& 0.200y_1 + 0.227y_2 + 0.154y_3 + 0.500y_4 - V \le 0 \\
& y_1 + y_2 + y_3 + y_4 = 1 \\
& y_1,\, y_2,\, y_3,\, y_4,\, V \ge 0
\end{aligned}
\]











We find the right-handed pitcher (RHP) should randomly throw 65.94% 
fastballs, no curve balls, 3.24% change-ups, and 30.82% split-finger fastballs 
to hold Judge to his 0.291 batting average. 

Statistics for Judge versus a left-handed pitcher (LHP) are in Table 7.15. 


TABLE 7.15: Aaron Judge vs. a Left-Handed Pitcher 





Judge / LHP | FB CB CH SF 
FB 0.353 0.185 0.220 0.244 
CB 0.143 0.333 0.333 0.253 
CH 0.071 0.333 0.353 0.247 
SF 0.300 0.240 0.254 0.450 





Set up as before, and solve the linear programming problem. 

\[
\begin{aligned}
\text{Maximize } \; & V \\
\text{subject to } \; & 0.353x_1 + 0.143x_2 + 0.071x_3 + 0.300x_4 - V \ge 0 \\
& 0.185x_1 + 0.333x_2 + 0.333x_3 + 0.240x_4 - V \ge 0 \\
& 0.220x_1 + 0.333x_2 + 0.353x_3 + 0.254x_4 - V \ge 0 \\
& 0.244x_1 + 0.253x_2 + 0.247x_3 + 0.450x_4 - V \ge 0 \\
& x_1 + x_2 + x_3 + x_4 = 1 \\
& x_1,\, x_2,\, x_3,\, x_4,\, V \ge 0
\end{aligned}
\]


We find the optimal solution for Judge versus a LHP. Judge should guess as 
follows: never guess fastball, guess curve ball 24.0%, never guess change-up, 
and guess split-finger fastball 76.0% for a batting average of .262. 
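These weights can be checked by hand. The sketch below assumes, as the solution states, that Judge never guesses fastball or change-up ($x_1 = x_3 = 0$), and that the constraints for the pitcher's fastball and curve ball (the first two in the program for the left-handed pitcher) hold with equality.

```latex
\[
\begin{aligned}
0.143\,x_2 + 0.300\,x_4 &= 0.333\,x_2 + 0.240\,x_4, \qquad x_2 + x_4 = 1,\\
0.060\,(1 - x_2) &= 0.190\,x_2
   \;\Longrightarrow\; x_2 = \frac{0.060}{0.250} = 0.24, \quad x_4 = 0.76,\\
V &= 0.143(0.24) + 0.300(0.76) \approx 0.262 .
\end{aligned}
\]
```

The remaining two constraints evaluate to about 0.273 and 0.403 at these weights, so they are slack, which is consistent with a batting average of .262.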

For the left-handed pitchers facing Judge, solve the following LP: 

\[
\begin{aligned}
\text{Minimize } \; & V \\
\text{subject to } \; & 0.353y_1 + 0.185y_2 + 0.220y_3 + 0.244y_4 - V \le 0 \\
& 0.143y_1 + 0.333y_2 + 0.333y_3 + 0.253y_4 - V \le 0 \\
& 0.071y_1 + 0.333y_2 + 0.353y_3 + 0.247y_4 - V \le 0 \\
& 0.300y_1 + 0.240y_2 + 0.254y_3 + 0.450y_4 - V \le 0 \\
& y_1 + y_2 + y_3 + y_4 = 1 \\
& y_1,\, y_2,\, y_3,\, y_4,\, V \ge 0
\end{aligned}
\]











The pitcher should randomly throw 37.2% fastballs, 62.8% curve balls, no 
change-ups, and no split-finger fastballs. Then Judge’s batting average will 
remain at .262, and won’t increase. 

The manager of the opposing team is in the middle of a close game. There 
are two outs, runners in scoring position, and Judge is coming to bat. Does the 
manager keep the LHP in the game or switch to a RHP? The percentages say 
keep the LHP since 0.262 < 0.291. Tell the catcher and pitcher to randomly 
select the pitches to be thrown to Judge. 

Judge’s manager wants to improve his batting ability against both a curve 
ball and a LHP. Only by improving against these strategies can he effect 
change. 


Example 7.7. Operation Overlord. 

Operation Overlord, the codename for World War II’s Battle of Normandy, 
can be viewed in the context of game theory. In 1944, the Allies were planning 
an operation for the liberation of Europe; the Germans were planning their 
defense against it. There were two known possibilities for an initial amphibious 
landing: the beaches at Normandy, and those at Calais. Any landing would 
succeed against a weak defense, so the Germans did not want a weak defense 
at the potential landing site. Calais was more difficult for a landing, but closer 
to the Allies’ targets for success. 

Suppose the probabilities of an Allied success are as in Table 7.16. 


TABLE 7.16: Probabilities of a Successful Allied Landing 

                           German Defense
                        Normandy     Calais
Allied    Normandy  |     75%         100%
Landing   Calais    |    100%          20%

The Allies successfully landing at Calais would earn 100 points, successfully 
landing at Normandy would earn 80 points, and failure at either landing would 
earn 0 points. What decisions should be made? 

We compute the expected values, placing them in the payoff matrix of 
Table 7.17. 


TABLE 7.17: Payoff Matrix for Allied Landing 

                              German Defense
                        Normandy             Calais
Allied    Normandy  |  0.75 · 80 = 60      1.0 · 80 = 80
Landing   Calais    |  1.0 · 100 = 100     0.2 · 100 = 20





There are no pure strategy solutions in this example. We use mixed strate- 
gies, and determine the game’s outcome. 


The Allies would employ a mixed strategy of 80% Normandy and 20% 
Calais to achieve an outcome of 68 points. At the same time, the Germans 
should employ a strategy of 60% Normandy and 40% Calais for their defenses 
to keep the Allies at 68 points. 
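Because the game is 2 x 2 with no saddle point, the mixed strategies can be found exactly by equalizing each side's expected payoffs. The sketch below uses Maple's `solve` on the Table 7.17 payoffs; the names `Allies` and `Germans` and the symbols `V` and `W` for the two game values are illustrative.

```maple
# Equalizing strategies for the 2 x 2 payoff matrix of Table 7.17.
# x1, x2: Allied weights on Normandy and Calais.
Allies := solve({60*x1 + 100*x2 = V, 80*x1 + 20*x2 = V, x1 + x2 = 1},
                {x1, x2, V});

# y1, y2: German weights on defending Normandy and Calais.
Germans := solve({60*y1 + 80*y2 = W, 100*y1 + 20*y2 = W, y1 + y2 = 1},
                 {y1, y2, W});
```

The first system gives $x_1 = 4/5$, $x_2 = 1/5$ and the second gives $y_1 = 3/5$, $y_2 = 2/5$, each with a game value of 68, in agreement with the percentages above.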

Implementation of the landing was certainly not two-pronged. So what do 
the mixed strategies imply in strategic thinking? Most likely a strong feint at 
Calais and lots of information leaks about Calais, while the real landing at 
Normandy was kept secret. The Germans had a choice: believe the information 
about Calais, or divide their defenses somewhat equally. Although the 
true results were in doubt for a while, the Allies prevailed. 


Example 7.8. Choosing the Right Course of Action. 

The US Army Command and General Staff College presented this approach 
for choosing the best course of action (COA) for a mission. For a possible 
battle between two forces, we compute the optimal courses of action for the 
two opponents using game theory.⁶ 


Steps 1 and 2. List the friendly COAs, and rank order them Best to Worst. 

COA 1: Decisive Victory 
COA 4: Attrition Based Victory 
COA 2: Failure by Culmination 
COA 3: Defeat in Place 


Step 3. The enemy is thought to have six distinct possible courses of action. 
Rank best to worst each COA of the enemy where the row represents the 
friendly COA. For example, the friendly COA 1 is best against the enemy 
COA 1 and friendly COA 2 is worst against the enemy COA 6. 

⁶ This example is adapted from [Cantwell2003]. 


Step 4. Decide in each case if we think we will Win, Lose, or Draw; see 
Table 7.18. 


TABLE 7.18: Enemy COA vs. Friendly COA 

                               Enemy Course of Action
                     COA 1   COA 2   COA 3   COA 4   COA 5   COA 6
Friendly   COA 1 |   Best    Win     Win     Draw    Loss    Loss
Course     COA 4 |   Win     Win     Win     Win     Loss    Loss
of         COA 2 |   Win     Win     Loss    Loss    Loss    Worst
Action     COA 3 |   Draw    Draw    Draw    Loss    Loss    Loss








Steps 5 and 6. Provide scores. Since there are 4 friendly COAs and 6 enemy 
COAs, we use scores from 24 (Best) to 1 (Worst). See Table 7.19. 


TABLE 7.19: Enemy COA vs. Friendly COA Scores 

                               Enemy Course of Action
                     COA 1   COA 2   COA 3   COA 4   COA 5   COA 6
Friendly   COA 1 |   Best    Win     Win     Draw    Loss    Loss
                 |    24      23      22      -       3       2
Course     COA 4 |   Win     Win     Win     Win     Loss    Loss
                 |    21      20      19      18      10      9
of         COA 2 |   Win     Win     Loss    Loss    Loss    Worst
                 |    17      16      11      7       8       1
Action     COA 3 |   Draw    Draw    Draw    Loss    Loss    Loss
                 |    -       -       -       6       5       4





Steps 7 and 8. Put into numerical order for Loss. 


Step 9. Fill in the scores for the draw. See Table 7.20. 

TABLE 7.20: Enemy COA vs. Friendly COA Scores with Draws 

                               Enemy Course of Action
                     COA 1   COA 2   COA 3   COA 4   COA 5   COA 6
Friendly   COA 1 |   Best    Win     Win     Draw    Loss    Loss
                 |    24      23      22      15      3       2
Course     COA 4 |   Win     Win     Win     Win     Loss    Loss
                 |    21      20      19      18      10      9
of         COA 2 |   Win     Win     Loss    Loss    Loss    Worst
                 |    17      16      11      7       8       1
Action     COA 3 |   Draw    Draw    Draw    Loss    Loss    Loss
                 |    14      13      12      6       5       4





Step 10. Put the courses of action back in their original order. Add minimax 
data. See Table 7.21. 


TABLE 7.21: Enemy COA vs. Friendly COA Minimax 

                        Enemy Course of Action (COA)
                       1     2     3     4     5     6   | Min
Friendly       1  |   24    23    22    15     3     2   |  2
Course of      2  |   17    16    11     7     8     1   |  1
Action         3  |   14    13    12     6     5     4   |  4
(COA)          4  |   21    20    19    18    10     9   |  9
             Max  |   24    23    22    15   [10]   18   | No saddle
















There is no pure strategy solution. Because of the size of the payoff matrix, 
we did not use the movement diagram, but instead used the Minimax theorem. 
We find the minimum in each row, and then the maximum of these 
minimums. Then we find the maximum in each column, and then the minimum of 
those maximums. If the maximum of the row minimums equals the minimum 
of the column maximums, then we have a pure strategy solution. If not, we 
have to find the mixed strategy solution. In Step 10 above, the maximum of 
the minimums is 9, while the minimum of the maximums is 10. They are not 
equal. 

Linear programming may be used in zero-sum games to find the solutions 
whether they are pure strategy or mixed strategy solutions. So, we solve this 
game using a linear program. Let $V$ be the value of the game, and let $x_1$ to 
$x_4$ be the probabilities with which to play strategies (COAs) 1 through 4 for 
the friendly side. The values $y_1$ to $y_6$ represent the probabilities the enemy 
should employ COA 1 to COA 6, respectively, to obtain their best results. 


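\[
\begin{aligned}
\text{Maximize } \; & V \\
\text{subject to } \; & 24x_1 + 16x_2 + 13x_3 + 21x_4 - V \ge 0 \\
& 23x_1 + 17x_2 + 12x_3 + 20x_4 - V \ge 0 \\
& 22x_1 + 11x_2 + 6x_3 + 19x_4 - V \ge 0 \\
& 3x_1 + 7x_2 + 5x_3 + 10x_4 - V \ge 0 \\
& 15x_1 + 8x_2 + 4x_3 + 9x_4 - V \ge 0 \\
& 2x_1 + 1x_2 + 14x_3 + 18x_4 - V \ge 0 \\
& x_1 + x_2 + x_3 + x_4 = 1 \\
& x_1,\, x_2,\, x_3,\, x_4,\, V \ge 0
\end{aligned}
\]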























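The linear program for the enemy is 

\[
\begin{aligned}
\text{Minimize } \; & v \\
\text{subject to } \; & 24y_1 + 23y_2 + 22y_3 + 3y_4 + 15y_5 + 2y_6 - v \le 0 \\
& 16y_1 + 17y_2 + 11y_3 + 7y_4 + 8y_5 + 1y_6 - v \le 0 \\
& 13y_1 + 12y_2 + 6y_3 + 5y_4 + 4y_5 + 14y_6 - v \le 0 \\
& 21y_1 + 20y_2 + 19y_3 + 10y_4 + 9y_5 + 18y_6 - v \le 0 \\
& y_1 + y_2 + y_3 + y_4 + y_5 + y_6 = 1 \\
& y_1,\, y_2,\, y_3,\, y_4,\, y_5,\, y_6,\, v \ge 0
\end{aligned}
\]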

















Maple gives the solution as $V = 9.462$ when “friendly” chooses $x_1 = 7.7\%$, 
$x_2 = 0$, $x_3 = 0$, $x_4 = 92.3\%$, while the “enemy” best results come when $y_1 = 0$, 
$y_2 = 0$, $y_3 = 0$, $y_4 = 46.2\%$, and $y_5 = 53.8\%$, holding the “friendly” to 9.462. 
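The Maple statements for this example are not listed, so the following is only a sketch of one way to enter both programs. It assumes the payoff matrix implied by the coefficients of the two linear programs (rows are friendly COAs 1 to 4, columns are enemy COAs 1 to 6); the names `A`, `FriendlyCons`, and `EnemyCons` are illustrative.

```maple
with(Optimization):

# Payoffs read from the coefficients of the two linear programs:
# rows are friendly COAs 1-4, columns are enemy COAs 1-6.
A := Matrix([[24, 23, 22,  3, 15,  2],
             [16, 17, 11,  7,  8,  1],
             [13, 12,  6,  5,  4, 14],
             [21, 20, 19, 10,  9, 18]]):
X := Vector[row](4, symbol = x):
Y := Vector(6, symbol = y):

# Friendly side: maximize V with one constraint per enemy COA.
FriendlyCons := {seq((X . A)[j] - V >= 0, j = 1 .. 6),
                 add(x[i], i = 1 .. 4) = 1}:
fnormal(LPSolve(V, FriendlyCons, assume = nonnegative, maximize), 4);

# Enemy side: minimize v (LPSolve minimizes by default), one constraint per friendly COA.
EnemyCons := {seq((A . Y)[i] - v <= 0, i = 1 .. 4),
              add(y[j], j = 1 .. 6) = 1}:
fnormal(LPSolve(v, EnemyCons, assume = nonnegative), 4);
```

As a hand check, weights of 1/13 on COA 1 and 12/13 on COA 4 make the fourth and fifth friendly constraints both equal to 123/13, about 9.462, which matches the value of the game reported above.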

Interpretation: At 92.3%, we see we should defend along the Vistula River 
(COA 4) almost all the time. The value of the game, V = 9.462, is greater than 
the pure strategy solution of 9 for always picking to defend the Vistula River 
(COA 4). This implies that we benefit from secrecy and employing deception. 
We can benefit by “selling the enemy” on our “attack North and fix in the 
South” (COA 1). A negative 9.462 for the enemy does not mean the enemy 
loses. We need to further consider the significance of the values and mission 
analysis. 
Exercises 
Solve the problems using any method. 


1. What should each player do according to the payoff matrix in Table 7.22? 


TABLE 7.22: Attack and Defense 

Attack and Defense                          Colonel Blotto
Tableau                          Defend City I (D1)   Defend City II (D2)
Colonel   Attack City I (A1)   |        30                   30
Sotto     Attack City II (A2)  |        20                    0








Use Table 7.23 below for Exercises 2 to 10. 


TABLE 7.23: Payoff Table 

Payoff                 Colin
Tableau             C1      C2
Rose     R1  |      a       b
         R2  |      c       d




2. What assumptions have to be true for a at R1C1 to be the pure strategy 
solution? 

3. What assumptions have to be true for b at R1C2 to be the pure strategy 
solution? 

4. What assumptions have to be true for c at R2C1 to be the pure strategy 
solution? 

5. What assumptions have to be true for d at R2C2 to be the pure strategy 
solution? 


6. What assumptions have to be true for there not to be a saddle point 
solution in the game? 
7. Show the value of the game is 

\[
v = \frac{ad - bc}{(a - c) + (d - b)}.
\]


8. Set a = 1/4, b = 1/4, c = 2, and d = 0 in Table 7.23. What is the pure 
strategy solution for this game? 


9. Set a= —3, b = 5, c= 4, and d = —3 in Table 7.23. What is the solution? 


10. Let a > d > b > c. Show that Colin should play C1 and C2 with probabilities 
x and (1 − x) where 

\[
x = \frac{d - b}{(a - c) + (d - b)}.
\]


11. Consider Table 7.24 of a batter-pitcher duel. All the entries in the payoff 
matrix reflect the percent of hits off the pitcher, the batting average. What 
strategies are optimal for each player? 


TABLE 7.24: Batter-Pitcher Duel Payoffs 








Payoff Tableau Pitcher 
Throw Fast Throw Knuckle 
Ball (C1) Ball (C2) 
Guess Fast 
: .1 
Batter | Ball (Ri) 360 T 
Guess Knuckle 

Ball (Rə) -310 -260 








12. Find the solution in the game shown in Table 7.25. 


TABLE 7.25: Two by Three Game 

                         Colin 
Payoff Tableau      C1     C2     C3 
Rose      R1        85     45     75 
          R2        75     35     35 

13. Find the solution in the game given in Table 7.26. 

TABLE 7.26: Four by Four Game 

                             Colin 
Payoff Tableau      C1     C2     C3     C4 
          R1        40     80     35     60 
Rose      R2        55     90     55     70 
          R3        55     40     45     75 
          R4        45     25     50     50 








14. Table 7.27 represents a game between a professional athlete (Rose) and 
management (Colin) for contract decisions. The athlete has two strategies 
and management has three strategies. The values are in 1,000s. What 
decision should each make? 


TABLE 7.27: Two by Three Game 

                         Colin 
Payoff Tableau      C1     C2     C3 
Rose      R1       490    220    195 
          R2       425    350    150 





15. Solve the game given in Table 7.28. 


TABLE 7.28: Game Payoff Table 


Colin 








Rose 





Rə | (3,-3) (4,—4) 


16. The predator has two strategies for catching the prey: ambush or pursuit. 
The prey has two strategies for escaping: hide or run. The game matrix 
appears in Table 7.29. Solve the game. 
TABLE 7.29: Predator-Prey Payoffs 

Payoff Tableau Predator 
Ambush Pursue 
Ci Cz 
Hide 
Prey Ry 0.26 0.33 








17. A professional football team has collected data for certain plays against 
certain defenses. The payoff matrix in Table 7.30 shows the yards gained 
or lost for a particular play against a particular defense. Find the best 
strategies. 

TABLE 7.30: Yards Gained or Lost 

Payoff Tableau      C1     C2     C3 
          R1         0     −1      5 
          R2         7      5     10 
Team A    R3        15     −4     −5 
          R4         5      0     10 
          R5        −5    −10     10 





18. Solve the Jeter-Romero batter-pitcher duel using the payoff matrix given 
in Table 7.31. 

TABLE 7.31: Jeter-Romero Batter-Pitcher Duel Payoffs 


Payoff Tableau Hacky MOm 





Throw Fast Throw Split- 





Ball (C4) Finger (C3) 
Guess Fast 
343 .267 
Derek Jeter Ball (R1) 
Guess Split- 195 Ade 


Finger (R2) 





19. Solve the game with the payoff matrix of Table 7.32. 


TABLE 7.32: Payoff Tableau 

                         Colin 
Payoff Tableau      C1     C2     C3 
          R1       0.5    0.9    0.9 
Rose      R2       0.1    0      0.1 
          R3       0.9    0.9    0.5 








20. Solve the game with the payoff matrix of Table 7.33. 


TABLE 7.33: Payoff Tableau 


Colin 
Payoff Tableau |-__________ 





Rose Ry 1 4 

21. Solve the game with the payoff matrix of Table 7.34. 

TABLE 7.34: Payoff Tableau 

                       Colin 
Payoff Tableau      C1     C2 
Rose      R1         3     −1 
          R2         2      4 








22. Find the solution for the game using the payoff matrix in Table 7.35. 


TABLE 7.35: Payoff Tableau 

                             Colin 
Payoff Tableau      C1     C2     C3     C4 
Rose      R1         1     −1      2      3 
          R2         2      4      0      5 





23. In Table 7.36 of a batter-pitcher duel, the entries reflect the percent of hits 
off the pitcher, the batting average. Find the optimal strategies for each 
player. 


TABLE 7.36: Batter-Pitcher Duel Payoffs 





Payoff Tableau [| Pitcher 
Throw Fast Throw Curve 
Ball (C1) Ball (C2) 
Guess Fast 
: 2 

Batter Ball (Ry) 300 00 
Guess Curve 

Ball (R2) aog 500 

24. Solve the game shown in Table 7.37. 

TABLE 7.37: Payoff Tableau 

                       Colin 
Payoff Tableau      C1     C2 
Rose      R1         3     −4 
          R2         1      3 








25. Solve the game shown in Table 7.38. 


TABLE 7.38: Payoff Tableau 








Colin 
Payoff Tableau 
Cı C2 C3 
Ry 1 1 10 
Rose 








26. Solve the game shown in Table 7.39. 


TABLE 7.39: Payoff Tableau 

                         Colin 
Payoff Tableau      C1     C2     C3 
Rose      R1         1      2      2 
          R2         2      1      2 










27. Determine the solution to the following zero-sum games shown in 
Tables 7.40 to 7.44 by any method or methods, but show/state work. 
State the value of the game for both Rose and Colin, and what strategies 
each player should choose. 


a) 

TABLE 7.40: Payoff Tableau 

                         Colin 
Payoff Tableau      C1     C2     C3 
          R1        10     20     14 
Rose      R2         5     21      8 
          R3         8     22      0 








b) 

TABLE 7.41: Payoff Tableau 

                       Colin 
Payoff Tableau      C1     C2 
Rose      R1        −8     12 
          R2         2      6 








TABLE 7.42: Payoff Tableau 


Payoff Tableau 


Colin 


Cı C2 





Ri 


Rose 

d) 

TABLE 7.43: Payoff Tableau 

                         Colin 
Payoff Tableau      C1     C2     C3 
Rose      R1        15     12     11 
          R2        14     16     17 








e) 

TABLE 7.44: Payoff Tableau 

                             Colin 
Payoff Tableau      C1     C2     C3     C4 
Rose      R1         3      2      4      1 
          R2        −9      1     −1      0 








28. For the game of Table 7.45 between Rose and Colin, write the linear 
programming formulation for Rose. Using Maple’s simplex or Optimization 
packages, find and state the complete solution to the game in context of 
the game. 

TABLE 7.45: Payoff Tableau 

                             Colin 
Payoff Tableau      C1     C2     C3     C4 
          R1         0      1      2      6 
          R2         2      4      1      2 
Rose      R3         1     −1      4     −1 
          R4        −1      1     −1      3 
          R5        −2     −2      2      2 

Projects 


Project 7.1. Research the solution methodologies for the three-person 
games. Analyze the three-person, zero-sum game between Rose, Colin, and 
Larry shown in Tables 7.46 and 7.47. 


TABLE 7.46: Payoff Tableau for Larry’s D1 

                              Colin 
Larry D1             C1               C2 
Rose      R1     (4, 4, −8)      (−2, 4, −2) 
          R2     (4, −5, 1)      (3, −3, 0) 








TABLE 7.47: Payoff Tableau for Larry’s D2 

                              Colin 
Larry D2             C1               C2 
Rose      R1     (−2, 0, 2)      (−2, −1, 3) 
          R2     (−4, 5, −1)     (1, 2, −3) 








a) Draw the movement diagram for this zero-sum game. Find any and all 
equilibria. If this game were played without any possible coalitions, what 
would you expect to happen? 

b) If Colin and Rose were to form a coalition against Larry, set up and solve 
the resulting 2 x 4 game. What are the strategies to be played and the 
payoffs for each of the three players? Solve by hand (Show your work!), 
then check using the 3-Person Template. 

c) Given the results of the coalitions of Colin vs. Rose-Larry and Rose 
vs. Colin-Larry as follows: 


Colin vs. Rose-Larry: (3,—3,0) and Rose vs. Colin-Larry: (—2, 0,2) 


If no side payments were allowed, would any player be worse off joining a 
coalition than playing alone? Briefly explain (include the values to justify 
decisions). 

d) What is (are) the preferred coalition(s), if any? 

e) Briefly explain how side payments could work in this game. 

7.4 Examples of Partial Conflict Games 

We present examples of partial conflict games with their solutions. 


Example 7.9. Cuban Missile Crisis: A Classic Game of Chicken⁷. 
“We’re eyeball to eyeball, and I think the other fellow just blinked,” were the 
eerie words of Secretary of State Dean Rusk at the height of the Cuban Missile 
Crisis in October, 1962. Secretary Rusk was referring to signals by the Soviet 
Union that it desired to defuse the most dangerous nuclear confrontation ever 
to occur between the superpowers, which many analysts have interpreted as 
a classic instance of a nuclear “Game of Chicken.” 

We will highlight the scenario from 1962. The Cuban Missile Crisis was 
precipitated in October 1962 by the Soviets’ attempt to install medium- and 
intermediate-range nuclear-armed ballistic missiles in Cuba that were capable 
of hitting a large portion of the United States. The range of the missiles from 
Cuba allowed for major political, population, and economic centers to become 
targets. The goal of the United States was immediate removal of the Soviet 
missiles. U.S. policy makers seriously considered two strategies to achieve this 
end: naval blockade or airstrikes. 

President Kennedy, in his speech to the nation, explained the situation 
as well as the goals for the United States. He set several initial steps. First, 
to halt the offensive build-up, a strict quarantine on all offensive military 
equipment under shipment to Cuba was being initiated. He went on to say 
that any launch of missiles from Cuba at anyone would be considered an 
act of war by the Soviet Union against the United States resulting in a full 
retaliatory nuclear strike against the Soviet Union. He called upon Soviet 
Premier Khrushchev to end this threat to the world and restore world peace. 

We will use the Cuban Missile Crisis to illustrate parts of the theory— 
not just an abstract mathematical model, but one that mirrors the real-life 
choices, and underlying thinking of flesh-and-blood decision makers. Indeed, 
Theodore Sorensen, special counsel to President Kennedy, used the language of 
“moves” to describe the deliberations of EXCOM, the Executive Committee of 
key advisors to President Kennedy during the Cuban Missile Crisis. Sorensen 
said, 


We discussed what the Soviet reaction would be to any possible 
move by the United States, what our reaction with them would have 
to be to that Soviet action, and so on, trying to follow each of those 
roads to their ultimate conclusion. 


Problem: Build a mathematical model that allows for consideration of alter- 
native decisions by the two opponents. 





7 Adapted from Brams, “Game theory and the Cuban missile crisis,” available at 
http://plus.maths.org/issue13/features/brams/index.html. 

Assumption: We assume the two opponents are rational players. 
Model Choice: Game Theory. 


The goal of the United States was the immediate removal of the Soviet 
missiles; U.S. policy makers seriously considered two strategies to achieve this 
end: 


1. A naval blockade (B), or "quarantine" as it was euphemistically called, to 
prevent shipment of more missiles, possibly followed by stronger action to 
induce the Soviet Union to withdraw the missiles already installed. 


2. A "surgical" air strike (A) to destroy the missiles already installed, insofar 
as possible, perhaps followed by an invasion of the island. 


The alternatives open to Soviet policy makers were: 
1. Withdrawal (W) of their missiles. 
2. Maintenance (M) of their missiles. 


We set (x, y) as (payoffs to the United States, payoffs to the Soviet Union) 
where 4 is the best result, 3 is next best, 2 is next worst, and 1 is the worst. 
Table 7.48 shows the payoffs. 


TABLE 7.48: Cuban Missile Crisis Payoffs 

                                     Soviet Union 
                              Withdraw        Maintain 
                              Missiles (W)    Missiles (M) 
United    Blockade (B)          (3, 3)          (2, 4) 
States    Air Strike (A)        (4, 2)          (1, 1) 





We show the movement diagram in Table 7.49, where the arrows lead to equilibria 
at (4,2) and (2,4). The Nash equilibria are boxed. 
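
The arrow logic of a movement diagram can be sketched in a few lines of Python (a cross-check outside Maple, not the authors' code). The function name `pure_nash` is hypothetical; it marks every cell from which neither player can gain by changing strategy alone, which is exactly a cell that all arrows point into.

```python
def pure_nash(payoffs):
    """Return (row, col) cells where neither player gains by moving alone.

    payoffs[r][c] is a pair (row player's payoff, column player's payoff).
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    cells = []
    for r in range(n_rows):
        for c in range(n_cols):
            u_row, u_col = payoffs[r][c]
            # Vertical arrows: the row player compares cells in the same column.
            row_stays = all(payoffs[k][c][0] <= u_row for k in range(n_rows))
            # Horizontal arrows: the column player compares cells in the same row.
            col_stays = all(payoffs[r][k][1] <= u_col for k in range(n_cols))
            if row_stays and col_stays:
                cells.append((r, c))
    return cells


# Table 7.48: rows = U.S. (Blockade, Air Strike),
# columns = Soviet Union (Withdraw, Maintain).
cuba = [[(3, 3), (2, 4)],
        [(4, 2), (1, 1)]]
equilibria = pure_nash(cuba)
print([cuba[r][c] for r, c in equilibria])
```

For the Cuban Missile Crisis payoffs this lists the cells holding (2,4) and (4,2), matching the two boxed equilibria.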

As in Chicken, as both players attempt to get to their equilibrium, the 
outcome of the game ends up at (1,1). This is disastrous for both countries and 
their leaders. The best solution is the (3,3) compromise position. However, 
(3,3) is not stable. This choice will eventually put us back at (1,1). In this 
situation, one way to avoid the “chicken dilemma” is to try strategic moves. 

TABLE 7.49: Cuban Missile Crisis Movement Diagram 

                                     Soviet Union 
                              Withdraw              Maintain 
                              Missiles (W)          Missiles (M) 
United    Blockade (B)          (3,3)       ⇒       [(2,4)] 
States                            ⇓                    ⇑ 
          Air Strike (A)       [(4,2)]      ⇐        (1,1) 

The two sides did not choose their strategies simultaneously or independently. 
The Soviets responded to our blockade after it was imposed. The U.S. held out 
the chance of an air strike as a viable choice even after the blockade. If the 
U.S.S.R. would agree to remove the weapons from Cuba, the U.S. would agree 
to (a) remove the quarantine and (b) not invade Cuba. If the Soviets 
maintained their missiles, the U.S. preferred the airstrike to the blockade. 
Attorney General Robert Kennedy said, “if they do not remove the missiles, 
then we will.” The U.S. used a combination of promises and threats. The 
Soviets knew our credibility in both areas was high (strong resolve). Therefore, 
they withdrew the missiles, and the crisis ended. Khrushchev and Kennedy 
were wise. 

Needless to say, the strategy choices, probable outcomes, and associated 
payoffs shown in Table 7.48 provide only a skeletal picture of the crisis as it 
developed over a period of thirteen days. Both sides considered more than the 
two alternatives listed, as well as several variations on each. The Soviets, for 
example, demanded withdrawal of American missiles from Turkey as a quid 
pro quo for withdrawal of their own missiles from Cuba, a demand publicly 
ignored by the United States. 

Nevertheless, most observers of this crisis believe that the two superpowers 
were on a collision course, which is actually the title of one book⁸ describing 
this nuclear confrontation. Analysts also agree that neither side was eager to 
take any irreversible step, such as one of the drivers in Chicken might do by 
defiantly ripping off the steering wheel in full view of the other driver, thereby 
foreclosing the option of swerving. 

Although in one sense the United States “won” by getting the Soviets to 
withdraw their missiles, Premier Nikita Khrushchev of the Soviet Union at the 
same time extracted from President Kennedy a promise not to invade Cuba, 
which seems to indicate that the eventual outcome was a compromise of sorts. 
But this is not game theory’s prediction for Chicken because the strategies 
associated with compromise do not constitute a Nash equilibrium. 

To see this, assume play is at the compromise position (3,3), that is, the 
U.S. blockades Cuba, and the U.S.S.R. withdraws its missiles. This strategy 
is not stable because both players would have an incentive to defect to their 
more belligerent strategy. If the U.S. were to defect by changing its strategy 
to airstrike, play would move to (4,2), improving the payoff the U.S. received; 
if the U.S.S.R. were to defect by changing its strategy to Maintenance, play 
would move to (2,4), giving the U.S.S.R. a payoff of 4. (This classic game 
theory setup gives us no information about which outcome would be chosen, 
because the table of payoffs is symmetric for the two players. This is a frequent 
problem in interpreting the results of a game theoretic analysis, where more 
than one equilibrium position can arise.) Finally, should the players be at the 
mutually worst outcome of (1,1), that is, nuclear war, both would obviously 
desire to move away from it, making the strategies associated with (1,1), like 
those with (3,3), unstable. 

8 Henry M. Pachter, Collision Course: The Cuban Missile Crisis and Coexistence, 
Praeger, NY, 1963. 


Example 7.10. Writer’s Guild Strike of 2007–2008. 
Game Theory Approach⁹ 
Let us begin by stating strategies for each side. Our two rational players will 
be the Writer’s Guild and the Management. We develop strategies for each 
player. 

Strategies: 


• Writer’s Guild: Their strategies are to strike (S) or not to strike (NS). 
• Management: Salary increase and revenue sharing (IN) or status quo (SQ). 


First, we rank order the outcomes for each side in order of preference. (The 
rank orderings are ordinal utilities.) 


Alternatives and Rankings 


• Strike vs. Status Quo = (S, SQ): 
Writer’s worst case (1); Management’s next to best case (3) 

• No Strike vs. Status Quo = (NS, SQ): 
Writer’s next to worst case (2); Management’s best case (4) 

• Strike vs. Salary Increase and Revenue Sharing = (S, IN): 
Writer’s next to best case (3); Management’s next to worst case (2) 

• No Strike vs. Salary Increase and Revenue Sharing = (NS, IN): 
Writer’s best case (4); Management’s worst case (1) 


This list provides us with a payoff matrix consisting of ordinal values; see 
Table 7.50. We will refer to the Writer’s Guild as Rose and the Management 
as Colin. 

The movement diagram in Table 7.51 finds (2,4) as the likely outcome. 





9 From [Fox2008]. 

TABLE 7.50: Payoff Matrix for Writers and Management 

                                        Management (Colin) 
                                Status Quo      Increase Salary 
                                   (SQ)              (IN) 
Writer’s    Strike (S)            (1, 3)            (3, 2) 
Guild 
(Rose)      No Strike (NS)        (2, 4)            (4, 1) 








TABLE 7.51: Writer’s Guild and Management’s Movement Diagram 

                                        Management (Colin) 
                                Status Quo           Increase Salary 
                                   (SQ)                   (IN) 
Writer’s    Strike (S)            (1,3)        ⇐         (3,2) 
Guild                               ⇓                      ⇓ 
(Rose)      No Strike (NS)       [(2,4)]       ⇐         (4,1) 




















The movement arrows point towards (2,4) as the pure Nash equilibrium. 
We also note that this result is not satisfying to the Writer’s Guild, and that 
they would like to have a better outcome. Both (3,2) and (4,1) within the 
payoff matrix provide better outcomes to the Writers. 

The Writers can employ several options to try to secure a better outcome. 
They can first try Strategic Moves, and if that fails to produce a better out- 
come, then they can move to Nash Arbitration. Both of these methods employ 
communications in the game. In strategic moves, we examine the game to see 
if the outcome is changed by “moving first”, threatening our opponent, or 
making promises to our opponent, or whether a combination of threats and 
promises changes the outcome. 

Examine the strategic moves. If the writers move first, their best result is 
again (2,4). If management moves first, the best result is (2,4). First moves 
keep us at the Nash equilibrium. The writers consider a threat: they tell 
management that if management chooses SQ, they will strike, putting us at (1,3). This 
result is indeed a threat, as it is worse for both the writers and management. 
However, the options for management under IN are both worse than (1,3), so 
management does not accept the threat. The writers do not have a promise to offer. At 
this point we might involve an arbiter using the Nash arbitration method as 
suggested earlier. 

The Nash arbitration formulation then is 

\[
\begin{aligned}
\text{Maximize } \; & Z = (x - 2)(y - 3) \\
\text{subject to } \; & \tfrac{3}{2}x + y = 7 \\
& x \ge 2 \\
& y \ge 3
\end{aligned}
\]

Writers and management security levels are found from prudential strategies 
using Equations (7.9) and (7.10). The security levels are calculated to be 
(2,3). We show this in Figure 7.4. 








FIGURE 7.4: The Payoff Polygon for Writer’s Guild Strike 
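
As a quick check of the status quo point (2,3), the following Python sketch computes each player's pure-strategy maximin payoff in their own game. It is not a reproduction of Equations (7.9) and (7.10), which are not restated here; the function names are hypothetical. The pure maximin equals the prudential security level only when each player's own game has a saddle point, so the sketch tests that condition as well.

```python
def security_levels(payoffs):
    """Pure-strategy maximin payoff for the row player and the column player."""
    rows, cols = range(len(payoffs)), range(len(payoffs[0]))
    row_level = max(min(payoffs[r][c][0] for c in cols) for r in rows)
    col_level = max(min(payoffs[r][c][1] for r in rows) for c in cols)
    return row_level, col_level


def saddle_checks(payoffs):
    """True for each player whose own game satisfies maximin == minimax."""
    rows, cols = range(len(payoffs)), range(len(payoffs[0]))
    row_ok = (max(min(payoffs[r][c][0] for c in cols) for r in rows)
              == min(max(payoffs[r][c][0] for r in rows) for c in cols))
    col_ok = (max(min(payoffs[r][c][1] for r in rows) for c in cols)
              == min(max(payoffs[r][c][1] for c in cols) for r in rows))
    return row_ok, col_ok


# Table 7.50: rows = Writers (S, NS), columns = Management (SQ, IN).
writers = [[(1, 3), (3, 2)],
           [(2, 4), (4, 1)]]
levels = security_levels(writers)
checks = saddle_checks(writers)
print(levels, checks)
```

Both saddle-point checks pass for this game, so the pure maximin values are the security levels used as the status quo point.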


The Nash equilibrium value (2,4) lies along the Pareto Optimal line seg- 
ment (from (4,1) to (2,4)). But the Writers can do better by going on strike 
and forcing arbitration, which is what they did. 

In this example, we consider “binding arbitration,” where the players have 
a third party work out the outcome that best meets their desires and is 
acceptable to all players. Nash found that this outcome can be obtained as follows: 


The status quo point is formed from the security levels of each side. 
We find the value (2,3) using prudential strategies. The function for 
the Nash Arbitration scheme is Maximize (x − 2)(y − 3). Using Maple’s 
QPSolve from the Optimization package, we find the desired solution 
to our quadratic program is x = 2.33 and y = 3.5. 


We have (x, y) = (2.33, 3.5) as our arbitrated solution. We can now 
determine how the arbiters should proceed. We solve the following simultaneous 
equations 

\[
\begin{aligned}
2p_1 + 4p_2 &= 2.33 \\
4p_1 + p_2 &= 3.5.
\end{aligned}
\]


We find that the probabilities to be played are $p_1 = 5/6$ and $p_2 = 1/6$. Further, we see 
that Player I, the Writers, should always play R2, so the management arbiter 
plays C1 with probability 5/6 and C2 with probability 1/6 during the arbitration. 
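
The arbitrated point and the mixing probabilities can be reproduced exactly with a short Python sketch (a cross-check outside Maple, not the QPSolve call used in the text). The helper `nash_point_on_segment` is a hypothetical name. It assumes the Pareto optimal set is the single segment from (2,4) to (4,1) and that the maximizer satisfies x ≥ 2 and y ≥ 3, both of which hold here.

```python
from fractions import Fraction as F


def nash_point_on_segment(p0, p1, status_quo):
    """Maximize (x - sx)(y - sy) over the segment from p0 to p1.

    Returns (t, (x, y)) where (x, y) = (1 - t) * p0 + t * p1.
    """
    (x0, y0), (x1, y1), (sx, sy) = p0, p1, status_quo
    dx, dy = F(x1 - x0), F(y1 - y0)

    def product(t):
        return (x0 - sx + dx * t) * (y0 - sy + dy * t)

    # product(t) is quadratic in t: a + b*t + c*t**2.
    b = dx * (y0 - sy) + dy * (x0 - sx)
    c = dx * dy
    candidates = [F(0), F(1)]
    if c != 0:
        t_star = -b / (2 * c)  # stationary point of the quadratic
        if 0 < t_star < 1:
            candidates.append(t_star)
    t = max(candidates, key=product)
    return t, (x0 + dx * t, y0 + dy * t)


t, point = nash_point_on_segment((2, 4), (4, 1), (2, 3))
weights = (1 - t, t)  # weight on (2,4), weight on (4,1)
print(point, weights)
```

The result is the point (7/3, 7/2), reached by weighting (2,4) with 5/6 and (4,1) with 1/6, in agreement with the rounded values 2.33 and 3.5.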


Example 7.11. Game Theory Applied to a Dark Money Network. 
“Dark money” originally referred to funds given to nonprofit organizations 
that can spend the money to influence elections and policy while not having to 
disclose their donors.¹⁰ The term has extended to encompass nefarious groups 
seeking to undermine a government. Discussing the strategies for defeating a 
Dark Money Network (DMN) leads naturally into the Game Theory analysis 
of the strategies for the DMN and for the State trying to defeat the DMN. 
When conducting this game theory analysis, we originally limited the analysis 
by using ordinal scaling, and ranking each of the four strategic options one 
through four. The game was set up in Table 7.52. Strategy A is for the state 
to pursue a non-kinetic strategy, B is a kinetic strategy. Strategy C is for the 
DMN to maintain its organization and D is for it to decentralize. 


TABLE 7.52: Dark Money Network 


Payoff DMN 
Tableau c D 





State 








This ordinal scaling worked when allowing communications and strategic 
moves; however, without a way of determining interval scaling it was impos- 
sible to conduct analysis of prudential strategies, Nash Arbitration, or Nash’s 
equalizing strategies. Here we show an application of Analytic Hierarchy Pro- 
cess (AHP) (See Chapter 8) in order to determine the interval scaled payoffs 
of each strategy for both the DMN and the State. We will use Saaty’s standard 
nine point preference in the pairwise comparison of combined strategies. For 
the State’s evaluation criteria, we chose four possible outcomes: how well the 
strategy degraded the DMN, how well it maintained the state’s own ability 
to raise funds, how well the strategy would rally their base, and finally how 
well it removed nodes from the DMN. The evaluation criteria we chose for 
the DMN’s four possible outcomes were: how anonymity was maintained, how 
much money the outcome would raise, and finally how well the DMN’s leaders 
could maintain control of the network. 

10 Adapted from Couch et al. [CFE2016]. 

After conducting an AHP analysis, we obtained a new payoff matrix in 
Table 7.53 with cardinal utility values. 


TABLE 7.53: Dark Money Network Revised 

                                   DMN 
Payoff Tableau           C                    D 
State      A      (0.168, 0.089)       (0.366, 0.099) 
           B      (0.140, 0.462)       (0.323, 0.348) 





With the cardinal scaling, it is now possible to conduct a proper analysis 
that might include mixed strategies or arbitrated results such as finding pru- 
dential strategies, Nash’s Equalizing Strategies, and Nash Arbitration. Using 
a series of game theory solvers developed by Feix¹¹ we obtain the following 
results. 


• Nash Equilibrium: A pure strategy Nash equilibrium was found at (A,D) 
of (0.366, 0.099) using strategies of Non-Kinetic and Decentralize. 

• Mixed Nash Equalizing Strategies: The State plays Non-Kinetic 91.9% 
of the time and Kinetic 8.1% of the time; DMN always plays Maintain 
Organization. 

• Prudential Strategies: The Security Levels are (0.168, 0.099) when the 
State plays A and the DMN plays D. 
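
The equalizing percentages above can be recomputed directly from Table 7.53. The Python sketch below is independent of the solvers cited in the text, and the function name `equalizing_row_mix` is hypothetical. It finds the probability on the first strategy that leaves the opponent indifferent between its two strategies and returns `None` when no such probability lies between 0 and 1.

```python
def equalizing_row_mix(opponent_payoffs):
    """Probability p on the first row that makes the opponent indifferent.

    opponent_payoffs[r][c] is the opponent's payoff when we play row r and
    the opponent plays column c. Solves
        p*b00 + (1-p)*b10 = p*b01 + (1-p)*b11.
    """
    (b00, b01), (b10, b11) = opponent_payoffs
    denom = b00 - b01 - b10 + b11
    if denom == 0:
        return None
    p = (b11 - b10) / denom
    return p if 0 <= p <= 1 else None


# Table 7.53: rows = State (A, B), columns = DMN (C, D).
state = [[0.168, 0.366], [0.140, 0.323]]
dmn = [[0.089, 0.099], [0.462, 0.348]]

# State's mix over (A, B) that equalizes the DMN's payoffs.
p_state = equalizing_row_mix(dmn)

# DMN's mix over (C, D) that would equalize the State's payoffs:
# transpose so that the DMN's strategies index the rows.
state_t = [[state[0][0], state[1][0]], [state[0][1], state[1][1]]]
q_dmn = equalizing_row_mix(state_t)

print(round(p_state, 3), q_dmn)
```

The State's mix comes out at about 0.919 on Non-Kinetic, and no valid equalizing mix exists for the DMN, consistent with the results listed above.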


Since there is no equalizing strategy for the DMN, we ask whether the State should 
attempt to equalize the DMN. The result is a significant departure 
from our original analysis prior to including the AHP pairwise comparisons. 
The original recommendation for the State was to use a kinetic strategy 50% of the 
time and a non-kinetic strategy 50% of the time. With proper scaling, however, 
the recommendation is to execute a non-kinetic strategy the vast majority (92%) 
of the time and only occasionally (8%) 
conduct kinetic targeting of network nodes. This greatly reinforces the 
recommendation to execute a non-kinetic strategy to defeat the DMN. 
Finally, if the State and the DMN could enter into arbitration, the result 
would be at BD, which was the same prediction as before the proper scaling. 





11 Feix, Game Theory: Toolkit and Workbook for Defense Analysis Students, Defense 
Tech. Info. Center, 2007. Available at http://www.dtic.mil/dtic/tr/fulltext/u2/a470073.pdf. 

Example 7.12. Course of Actions Decision Process for Partial Conflict Games. 

Return to the zero-sum game of Example 7.8 of Section 7.3. In many real world 
analyses, a player winning is not necessarily another player losing. Both might 
win or both might lose based upon the mission and courses of action played. 
Again, assume Player I has four strategies and Player II has six strategies that 
they can play. We use an AHP approach (see Fox [Fox2014]) to obtain cardinal 
values of the payoff to each player. Although this example is based on combat 
courses of action, this methodology can be used when competing players both 
have courses of action that they could employ. As in Example 7.11, we use 
AHP to convert the ordinal rankings of the combined COAs in order to obtain 
cardinal values. 


The game’s payoff matrices become 

> with(Optimization): 
> A := Matrix([[6, 5.75, 5.5, 0.75, 3.75, 0.5], 
               [4, 4.25, 2.75, 1.75, 2, 0.25], 
               [3.25, 3, 1.5, 1.25, 1, 3.5], 
               [5.25, 5, 4.75, 2.5, 2.25, 4.5]]): 
> B := Matrix([[0.167, 0.333, 0.5, 0.6336, 3.667, 3.833], 
               [1.5, 1.333, 2.333, 0.8554, 2.833, 4], 
               [2.1667, 2, 3.1667, 0.3574, 3.5, 1.833], 
               [0.667, 0.833, 1, 5.95, 2.667, 1.1667]]): 
> X := Vector[row](4, symbol = x): 
> Y := Vector(6, symbol = y): 
> Constraints := {seq((A.Y)[i] <= p, i = 1..4), 
                  seq((X.B)[i] <= q, i = 1..4), 
                  add(x[i], i = 1..4) = 1, add(y[i], i = 1..6) = 1}: 
> Objective := expand(X.A.Y + X.B.Y - p - q): 








The solutions, depending on the starting conditions for p and q, are found 
using Maple’s QPSolve, since Objective is quadratic, or NLPSolve, since it is 
nonlinear. As before, use fnormal to eliminate numerical artifacts. 


> theProgram := (Objective, Constraints, assume = nonnegative, 
                 maximize): 
> fnormal(QPSolve(theProgram), 3); 
[3.08, [p = 2.86, q = 0.634, x1 = 1., x2 = 0., x3 = 0., x4 = 0., y1 = 0., y2 = 0., 
y3 = 0., y4 = 0., y5 = 0.727, y6 = 0.273]] 

> fnormal(NLPSolve(theProgram, initialpoint = {p = 3, q = 6}), 3); 
[0., [p = 2.50, q = 5.95, x1 = 0., x2 = 0., x3 = 0., x4 = 1., y1 = 0., y2 = 0., 
y3 = 0., y4 = 1., y5 = 0., y6 = 0.]] 

> fnormal(NLPSolve(theProgram, initialpoint = {p = 2, q = 1}), 3); 
[3.08, [p = 2.86, q = 0.634, x1 = 1., x2 = 0., x3 = 0., x4 = 0., y1 = 0., y2 = 0., ... 

















We find Player I should play a pure strategy of COA 4, while Player 
II should play either a mixed strategy of $y_5 = 8/11$ and $y_6 = 3/11$, or a 
pure strategy of always $y_4$. We also employed sensitivity analysis by varying 
the criteria weights, which represent the cardinal values. We found little 
change in the results from the solution analysis presented here. 
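
The pure-strategy outcome can be confirmed directly from the matrices A and B with a mutual best-response test. The Python sketch below is a cross-check outside Maple, and the function name `is_pure_nash` is hypothetical. It checks pure strategies only and says nothing about the mixed solution.

```python
# Payoff matrices from the Maple session: A for Player I, B for Player II.
A = [[6, 5.75, 5.5, 0.75, 3.75, 0.5],
     [4, 4.25, 2.75, 1.75, 2, 0.25],
     [3.25, 3, 1.5, 1.25, 1, 3.5],
     [5.25, 5, 4.75, 2.5, 2.25, 4.5]]
B = [[0.167, 0.333, 0.5, 0.6336, 3.667, 3.833],
     [1.5, 1.333, 2.333, 0.8554, 2.833, 4],
     [2.1667, 2, 3.1667, 0.3574, 3.5, 1.833],
     [0.667, 0.833, 1, 5.95, 2.667, 1.1667]]


def is_pure_nash(A, B, i, j):
    """True if row i and column j are best responses to each other."""
    row_best = all(A[k][j] <= A[i][j] for k in range(len(A)))
    col_best = all(B[i][k] <= B[i][j] for k in range(len(B[0])))
    return row_best and col_best


# Report equilibria with 1-based COA numbers, as in the text.
pure = [(i + 1, j + 1)
        for i in range(len(A)) for j in range(len(A[0]))
        if is_pure_nash(A, B, i, j)]
print(pure)
```

The only pair that passes is Player I's COA 4 against Player II's COA 4, with payoffs 2.5 and 5.95, matching p = 2.50 and q = 5.95 in the second solver output.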








Exercises 


Solve the problems using any method. 
1. Find all the solutions for Table 7.54. 


TABLE 7.54: U.S. vs. State-Sponsored Terrorism 

                                           State Sponsor 
Payoff Tableau                      Sponsor        Stop Sponsoring 
                                    Terrorism      Terrorism 
U.S.    Strike Militarily            (2, 4)          (1.5, 0) 
        Do Not Strike Militarily     (1, 1.5)        (0, 4) 








2. Find all the solutions for Table 7.55. 


TABLE 7.55: U.S. vs. State-Sponsored Terrorism 


Payoff State Sponsor 





Tableau Sponsor Stop Sponsoring 
Terrorism Terrorism 


U.S. | Strike Militarily (3, 5) (2,1) 

3. Find all the solutions for Table 7.56. 

TABLE 7.56: U.S. vs. State-Sponsored Terrorism 

                          Colin 
Payoff Tableau        Arm       Disarm 
Rose      Arm        (2, 2)     (1, 1) 
          Disarm     (4, 1)     (3, 3) 








4. Consider the following classical games of Chicken in Tables 7.57 to 7.59. 
Find the solutions. 














(a) 
TABLE 7.57: A Classical Game of Chicken I 
Payoff Colin 
Tableau c D 
A 3,3 2,4 
eae (3, 3) (2,4) 
1 
(b) 


TABLE 7.58: A Classical Game of Chicken II 





Payoff Colin 
Tableau c D 
A (2,3) (4,1) 


Rose 

TABLE 7.59: A Classical Game of Chicken III 








Payoff Colin 
Tableau C D 
A 4,3 3,4 
Rose l ) (3.4) 








5. Consider the classical games of Prisoner’s Dilemma in Tables 7.60 and 
7.61. Find the solutions. 


(a) 


TABLE 7.60: A Classical Game of Prisoner’s Dilemma I 








Payoff Colin 
Tableau c D 
A (3,3)  (=1,5) 
Rose 








(b) 


TABLE 7.61: A Classical Game of Prisoner’s Dilemma II 








Payoff Colin 
Tableau c D 
A (3,3) (1,5) 


Rose 

Projects 


Project 7.1. Corporation XYZ consists of Companies Rose and Colin. 
Company Rose can make Products R1 and R2. Company Colin can make Products 
C1 and C2. These products are not in strict competition with one another, but 
there is an interactive effect depending on which products are on the market 
at the same time, as reflected in Table 7.62 below. The table reflects profits in 
millions of dollars per year. For example, if products R2 and C1 are produced 
and marketed simultaneously, Rose’s profits are 4 million and Colin’s 5 million 
annually. Rose can make any mix of R1 and R2, and Colin can make any mix 
of C1 and C2. Assume the information below is known to each company. 
NOTE: The CEO is not satisfied with just summing the total profits. He 
might want the Nash Arbitration Point to award each company proportion- 
ately based on their strategic positions, if other options fail to produce the 
results he desires. Further, he does not believe a dollar to Rose has the same 
importance to the corporation as a dollar to Colin. 


TABLE 7.62: Corporate Payoff Matrix 

                           Company Colin 
Payoff Tableau             C1         C2 
Company     R1           (3, 7)     (8, 5) 
Rose        R2           (4, 5)     (5, 6) 








a. Suppose the companies have perfect knowledge and implement market 
strategies independently without communicating with one another. What 
are the likely outcomes? Justify your choice. 

b. Suppose each company has the opportunity to exercise a strategic move. 
Try first moves for each player; determine if a first move improves the 
results of the game. 

c. In the event things turn “hostile” between Rose and Colin, find, state, 
and then interpret 

i. Rose’s Security Level and Prudential Strategy. 
ii. Colin’s Security Level and Prudential Strategy. 


Now suppose that the CEO is disappointed with the lack of spontaneous 
cooperation between Rose and Colin, and decides to intervene and dictate 
the “best” solution for the corporation. The CEO employs an arbiter 
to determine an “optimal production and marketing schedule” for the 
corporation. What is this strategy? 

d. Explain the concept of “Pareto Optimal” from the CEO’s point of view. 
Is the “likely outcome” you found in question (a) at or above Pareto 
Optimal? Briefly explain and provide a payoff polygon plot. 

e. Find and state the Nash Arbitration Point using the security levels found 
above. 


f. Briefly discuss how you would implement the Nash Point. In particular, 
what mix of the products R1 and R2 should Rose produce and market, 
and what mix of the products C1 and C2 should Colin produce? 
Must their efforts be coordinated, or do they simply need to produce the 
“optimal mix”? Explain briefly. 

g. How much annual profit will Rose and Colin each make when the CEO’s 
dictated solution is implemented? 




7.5 Conclusion 


We have presented some basic material concerning applied game theory and 
its uses in business, government, and industry. We presented some solution 
methodologies to solve these simultaneous total- and partial-conflict games. 
For analysis of sequential games, games with cooperation, and N-person 
games, please see the additional readings. 
Introduction to Problem Solving with 
Multi-Attribute Decision Making 





Objectives: 


(1) Know the types of multi-attribute decision techniques. 


(2) Understand the different weighting schemes and how to imple- 
ment them. 


(3) Know which technique or techniques to use. 


(4) Know the importance of sensitivity analysis. 


(5) Recognize the importance of technology in the solution process. 


The Department of Homeland Security (DHS) has a limited number of 
assets and a finite amount of time to conduct investigations, thus action pri- 
orities must be established. The Risk Assessment Office has collected the data 
shown in Table 8.1 for the morning Daily Briefing. Your Operations Research 
Team must analyze the information and provide a priority list to the Risk 
Assessment Team in time for the briefing. 


TABLE 8.1: DHS Risk Assessment Data 

Incident              Reliability   Approx. deaths   Damage estimate   Pop. density   Psych.   Intel. 
                      assess.       (×10^3)          (×10^6)           (×10^3)        factor   tips 
Dirty Bomb            0.40          10               150               4.5            9        3 
Bio-Terror            0.45          0.8              10                3.2            7.5      12 
DC Roads              0.35          0.005            300               0.85           6        8 
NY Subway             0.73          12               200               6.3            7        5 
DC Metro              0.69          11               200               2.5            7        5 
Major Bank Robbery    0.81          0.0002           10                0.57           2        [illegible] 
Air Traffic Control   0.70          0.001            5                 0.15           4.5      15 

(Note: Psych. factor = Destructive psychological influence of the act.) 
TASK. Build a model that ranks the incidents in a priority order. 


ASSUMPTIONS. The main suppositions are: 

• Past decisions will give insights into the decision maker’s process. 

• Table 8.1 holds the only data available: reliability, approximate number of 
deaths anticipated, approximate remediation costs, location of the action, 
destructive psychological influence on the citizenry, and the number of 
intelligence tips gathered. 

• The listed factors will form the criteria for the analysis. 

• The data is accurate and precise. 


The problem will be solved with the SAW (exercise) and TOPSIS methods. 





8.1 Introduction 


Multiple-attribute decision making (MADM) concerns making decisions when 
there are multiple, but finite, alternatives and criteria. This topic is sometimes 
called multi-criteria decision analysis or MCDA. These problems differ from 
analysis where we have only one criterion, such as cost, with several alternatives. 
We address problems such as in the DHS scenario where there are six criteria 
with seven alternatives that impact the decision. 

Consider a problem where management needs to prioritize or rank order 
alternative choices such as: identifying key nodes in a supply chain, choosing 
a contractor or sub-contractor, selecting airports, ranking recruiting efforts, 
ranking banking facilities, ranking schools or colleges, etc. How can setting 
relative priorities or choosing rank orders be accomplished analytically? 

We will present four methodologies for prioritizing or rank ordering 
alternatives based upon multiple criteria. The methodologies are 

• Data Envelopment Analysis (DEA) 
• Simple Additive Weighting (SAW) 
• Analytical Hierarchy Process (AHP) with Objective Data¹ 
• Technique of Order Preference by Similarity to Ideal Solution (TOPSIS) 

¹Our discussion of AHP will be restricted to data that is real, and not subjective. For 
further study, see Saaty’s Fundamentals of Decision Making and Priority Theory With the 
Analytic Hierarchy Process. 

For each technique, we describe its methodology, discuss strengths and 
limitations, offer tips for conducting sensitivity analysis, and present illustrative 
examples using Maple. 





8.2 Data Envelopment Analysis 


Charnes, Cooper, and Rhodes [CharnesCooperRhodes1978] described data 
envelopment analysis (DEA) as a mathematical programming model applied 
to observational data, providing a new way of obtaining empirical estimates of 
relationships among decision making units (DMUs) that take multiple inputs 
and produce multiple outputs. They were inspired by “relative efficiency” in 
combustion engineering. The definition of a DMU is generic and very flexible. 
Any object to be ranked can be a DMU, from individuals to government min- 
istries. DEA has been formally defined as a methodology directed to frontier 
analysis, rather than to central tendencies. The technique is a “data input- 
output driven” approach for evaluating the relative performance of DMUs. 
DEA has been used to evaluate the performance or efficiencies of hospitals, 
schools, departments, university faculty, US Air Force Wings, armed forces 
recruiting agencies, universities, cities, courts, businesses, banking facilities, 
countries, regions, SOF airbases, key nodes in networks, ...; the list goes on. 
A Google Scholar search on “data envelopment analysis” returns over 330,000 
results in under 0.1 seconds. According to Cooper ([CooperLSTTZ2001], the 
first item returned by the search), DEA has been used to gain insights into 
activities that were not able to be obtained by other quantitative or qualitative 
methods. 





Data Envelopment Analysis as a Linear Program 


A DEA model, in simplest terms, may be formulated and solved as a linear pro- 
gramming problem (Winston [Winston2002], Callen [Callen1991]). Although 
there are several representations for DEA, we’ll use the most straightforward 
formulation for maximizing the efficiency of the kth DMU as constrained by 
inputs and outputs (shown in (8.1)). As an option, we may wish to normal- 
ize the metric inputs and outputs for the alternatives if the values are poorly 
scaled within the data. We will call this data matrix X with entries $x_{ij}$. Define 
each DMU or efficiency unit as $E_i$ for i = 1, 2, ..., n for n DMUs. Let $w_i$ be 
the weights or coefficients for the linear combinations. Further, restrict any 
efficiency value from being larger than one (100%). Thus, the most efficient 
DMU will have efficiency value 1. These requirements give the following linear 
programming formulation for DMUs with multiple inputs yielding a single 
output. 

$$
\begin{aligned}
\text{Maximize}\quad & E_k \\
\text{subject to}\quad & \sum_{i} w_i x_{ij} - E_j = 0, \quad j = 1, 2, \dots \\
& E_i \le 1 \quad \text{for all } i
\end{aligned}
\tag{8.1}
$$

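For multiple inputs and outputs, we recommend using (8.2), the formulation 
provided by Winston ([Winston2002]) and Trick ([Trick1996]). Let $X_i$ be the 
inputs array and $Y_i$ be the outputs array for DMU_i. Let $X_0$ and $Y_0$ be those for 
DMU_0, the DMU being modeled; then 

$$
\begin{aligned}
\text{Minimize}\quad & \theta \\
\text{subject to}\quad & \sum_{i=1}^{n} \lambda_i X_i \le \theta X_0 \\
& \sum_{i=1}^{n} \lambda_i Y_i \ge Y_0 \\
& \lambda_i \ge 0 \quad \text{for all } i
\end{aligned}
\tag{8.2}
$$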


Strengths and Limitations of DEA 


DEA is a very useful tool when used wisely [Trick1996]. Strengths that make 
DEA very useful include: 
1. DEA can handle multiple input and multiple output models; 


2. DEA doesn’t require an assumption of a functional form relating inputs 
to outputs; 


3. DMUs are directly compared against a peer or combination of peers; and 
4. inputs and outputs can have very different units. 


For example, X1 could be in units of lives saved, while X2 could be in units 
of dollars spent without requiring any a priori tradeoff between the two. 

The same characteristics that make DEA a powerful tool can also create 
limitations. An analyst should keep these limitations in mind when choosing 
whether or not to use DEA. Limitations include: 


1. DEA is an extreme point technique, thus noise in the data, such as mea- 
surement error, can cause significant problems. 
2. DEA is good at estimating relative efficiency of a DMU, but it does not 
directly measure absolute efficiency. In other words, DEA can show how 
well a DMU is doing compared to its peers, but not compared to a theo- 
retical maximum. 


3. Since DEA is a nonparametric technique, statistical hypothesis tests are 
difficult—they are the focus of ongoing research. 


4. Since a standard formulation of DEA with multiple inputs and outputs 
creates a separate linear program for each DMU, large problems can be 
computationally extremely intensive. 


5. Linear programming does not ensure all weights are considered. We find 
that the values for weights are only for those that optimally determine an 
efficiency rating. If having all criteria (all inputs and outputs) weighted is 
essential to the decision maker, then DEA is not appropriate. 


Sensitivity Analysis 


Sensitivity analysis is always an important element in every modeling project. 
According to Neralić ([Neralic1998]), an increase in any output cannot worsen 
an efficiency rating, nor can a decrease in inputs alone worsen an already 
achieved efficiency rating. As a result, in our examples we only decrease out- 
puts and increase inputs. We will briefly illustrate sensitivity analysis, as appli- 
cable, in the examples. 


Example 8.1. Manufacturing Units. 

A manufacturing process involves three DMUs each having two inputs and 
three outputs. Management wishes to assess the efficiency of each DMU 
in order to target resources to improve performance. The data appears in 
Table 8.2. 


TABLE 8.2: Manufacturing DMU Data 

DMU | Input 1   Input 2 | Output 1   Output 2   Output 3 
I   |    5         14   |    9          4          16 
II  |    8         15   |    5          7          10 
III |    7         12   |    4          9          13 





Since no units are given and the values have similar scales, the data doesn’t 
have to be normalized. 
Define the variables 

t_i = value of a single unit of output of DMU_i 
w_i = cost or weights for one unit of inputs to DMU_i 
X = matrix of input data 
Y = matrix of output data 
DMU_i = objective function for DMU_i’s linear program 
Eff_i = relative efficiency of DMU_i, with a vector of weights 

all for i = 1, 2, and 3. 
Assume that 

• No DMU can have an efficiency of more than 100%. 

• If any efficiency is less than 100%, then that DMU is inefficient. 

• The costs are scaled so that the cost of the inputs equals 1 for each linear 
program. For example, use 5w1 + 14w2 = 1 in the LP for DMU1. 

• All values and weights must be strictly positive. (We may have to use a 
constant such as 0.0001 in lieu of 0 in inequalities to help numeric routines 
converge.) 


To calculate the efficiency of DMU1, use the linear program 

$$
\begin{aligned}
\text{Maximize}\quad & \mathrm{DMU}_1 = 9t_1 + 4t_2 + 16t_3 \\
\text{subject to}\quad
& -9t_1 - 4t_2 - 16t_3 + 5w_1 + 14w_2 \ge 0, \\
& -5t_1 - 7t_2 - 10t_3 + 8w_1 + 15w_2 \ge 0, \\
& -4t_1 - 9t_2 - 13t_3 + 7w_1 + 12w_2 \ge 0, \\
& 5w_1 + 14w_2 = 1, \\
& t_i, w_j \ge 0 \quad \text{for all } i, j.
\end{aligned}
$$

To calculate the efficiency of DMU2, use the linear program 

$$
\begin{aligned}
\text{Maximize}\quad & \mathrm{DMU}_2 = 5t_1 + 7t_2 + 10t_3 \\
\text{subject to}\quad
& -9t_1 - 4t_2 - 16t_3 + 5w_1 + 14w_2 \ge 0, \\
& -5t_1 - 7t_2 - 10t_3 + 8w_1 + 15w_2 \ge 0, \\
& -4t_1 - 9t_2 - 13t_3 + 7w_1 + 12w_2 \ge 0, \\
& 8w_1 + 15w_2 = 1, \\
& t_i, w_j \ge 0 \quad \text{for all } i, j.
\end{aligned}
$$

To calculate the efficiency of DMU3, use the linear program 

$$
\begin{aligned}
\text{Maximize}\quad & \mathrm{DMU}_3 = 4t_1 + 9t_2 + 13t_3 \\
\text{subject to}\quad
& -9t_1 - 4t_2 - 16t_3 + 5w_1 + 14w_2 \ge 0, \\
& -5t_1 - 7t_2 - 10t_3 + 8w_1 + 15w_2 \ge 0, \\
& -4t_1 - 9t_2 - 13t_3 + 7w_1 + 12w_2 \ge 0, \\
& 7w_1 + 12w_2 = 1, \\
& t_i, w_j \ge 0 \quad \text{for all } i, j.
\end{aligned}
$$














Use Maple to solve the three linear programs. 

> with(Optimization) : 

Define the input and output data matrices. Using the Matrix Palette makes 
entry easier and less error prone. Set the size, then click Insert Matrix. 

> Inputs := [ 5  14 ] 
            [ 8  15 ] 
            [ 7  12 ] : 
  Outputs := [ 9  4  16 ] 
             [ 5  7  10 ] 
             [ 4  9  13 ] : 

Define the decision variables and weights. 

> T := <t[i] $ i = 1..3> : 
  W := <w[i] $ i = 1..2> : 

Compute the LP’s objective functions. 

> DMUObj := Outputs . T; 
        [ 9 t1 + 4 t2 + 16 t3 ] 
        [ 5 t1 + 7 t2 + 10 t3 ] 
        [ 4 t1 + 9 t2 + 13 t3 ] 

Set up the constraints. 

> Inputs . W - Outputs . T >=~ 0;   # ‘>=~ 0’ applies ‘>= 0’ element-wise. 
  MainConstraints := convert(%, set) : 
        [ 0 <= 5 w1 + 14 w2 - 9 t1 - 4 t2 - 16 t3 ] 
        [ 0 <= 8 w1 + 15 w2 - 5 t1 - 7 t2 - 10 t3 ] 
        [ 0 <= 7 w1 + 12 w2 - 4 t1 - 9 t2 - 13 t3 ] 

> Inputs . W =~ 1;   # ‘=~ 1’ applies ‘= 1’ to each element. 
  DMUcons := convert(%, list) : 
        [ 5 w1 + 14 w2 = 1 ] 
        [ 8 w1 + 15 w2 = 1 ] 
        [ 7 w1 + 12 w2 = 1 ] 

Now solve the three linear programs. 

> NumDMU := 3 : 
  for i to NumDMU do 
    Constraints[i] := MainConstraints union {DMUcons[i]} : 
    LPSolve(DMUObj[i], Constraints[i], assume = nonnegative, maximize) : 
    Eff[i] := fnormal(%, 4) : 
  end do: 
  Matrix(convert(Eff, list)); 
        [ 1.0     [t1 = 0.0, t2 = 0.0, t3 = 0.06250, w1 = 0.0, w2 = 0.07143]     ] 
        [ 0.7733  [t1 = 0.08000, t2 = 0.05333, t3 = 0.0, w1 = 0.0, w2 = 0.06667] ] 
        [ 1.0     [t1 = 0.0, t2 = 0.0, t3 = 0.07065, w1 = 0.0, w2 = 0.08333]     ] 





The linear program solutions show the relative efficiencies of DMU1 and DMU3 
are 100%, while DMU2’s is 77.3%. 


INTERPRETATION. DMU2 is operating at 77.3% of the efficiency of DMU1 and 
DMU3. Management could concentrate on improvements for DMU2 by taking 
best practices from DMU1 or DMU3. 
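
Outside Maple, the printed solution for DMU2 can be sanity-checked with a few lines of Python. The sketch below is only a check of feasibility and of the objective value; it does not solve the LP, so it cannot confirm optimality. The data come from Table 8.2 and the weights are the rounded values shown in the LPSolve output; the helper name `dot` is an assumption of this sketch.

```python
# Check the reported multiplier solution for DMU2 in Example 8.1.
# Data are from Table 8.2; the weights are the rounded values printed by LPSolve.
inputs = [[5, 14], [8, 15], [7, 12]]
outputs = [[9, 4, 16], [5, 7, 10], [4, 9, 13]]

t = [0.08000, 0.05333, 0.0]   # output weights t1, t2, t3
w = [0.0, 0.06667]            # input weights w1, w2
k = 1                         # zero-based index of DMU2


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# Slack of each main constraint: weighted inputs minus weighted outputs.
slacks = [dot(inputs[i], w) - dot(outputs[i], t) for i in range(3)]
normalization = dot(inputs[k], w)   # should be close to 1 for the DMU being rated
efficiency = dot(outputs[k], t)     # objective value for DMU2

print("slacks:", [round(s, 4) for s in slacks])
print("normalization:", round(normalization, 4))
print("efficiency:", round(efficiency, 4))
```

The slacks for DMU1 and DMU3 are essentially zero, which is consistent with those two units being the peers that limit DMU2.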

To compute the shadow prices for the linear programs, paying special 
attention to those of DMU2, solve the dual LPs. Maple’s dual command in the 
simplex package does not handle equality constraints, so replace all equalities 
with two inequalities <= and >=. 

> for i to NumDMU do 
    DualConstraints[i] := MainConstraints union 
        {1 <= lhs(DMUcons[i]), 1 >= lhs(DMUcons[i])} : 
    DualLP[i] := simplex:-dual(DMUObj[i], DualConstraints[i], λ) : 
    LPSolve(DualLP[i], assume = nonnegative) : 
    DualEff[i] := fnormal(%, 4) : 
  end do: 
  Matrix(convert(DualEff, list)); 
        [ 1.0     [λ1 = 0.0, λ2 = 1.0, λ3 = 0.0, λ4 = 0.0, λ5 = 1.0]          ] 
        [ 0.7733  [λ1 = 0.0, λ2 = 0.7733, λ3 = 0.0, λ4 = 0.6615, λ5 = 0.2615] ] 
        [ 1.0     [λ1 = 0.0, λ2 = 1.0, λ3 = 0.0, λ4 = 1.0, λ5 = 0.0]          ] 
Examining the shadow prices from the dual linear program for DMU2 
shows λ5 = 0.26, λ4 = 0.66, and λ3 = 0. The average output vector for 
DMU2 can be written as 

$$
0.26 \begin{bmatrix} 9 \\ 4 \\ 16 \end{bmatrix} + 0.66 \begin{bmatrix} 4 \\ 9 \\ 13 \end{bmatrix} = \begin{bmatrix} 5 \\ 7 \\ 12.785 \end{bmatrix},
$$

and its average input vector is 

$$
0.26 \begin{bmatrix} 5 \\ 14 \end{bmatrix} + 0.66 \begin{bmatrix} 7 \\ 12 \end{bmatrix} = \begin{bmatrix} 5.94 \\ 11.6 \end{bmatrix}.
$$

DMU2’s Output 3 in Table 8.2 is 10 units. Thus, the inefficiency is in Output 
3, where 12.785 units are required. We find that they are short 2.785 units 
(= 12.785 − 10). This calculation helps focus on treating the inefficiency found 
for Output 3. 
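
The composite peer for DMU2 is simple arithmetic, so it can be reproduced outside Maple. The following Python sketch assumes, as in the displayed equations, that λ5 multiplies DMU1’s data and λ4 multiplies DMU3’s data, and it uses the four-digit dual values 0.2615 and 0.6615 from the worksheet output rather than the two-digit roundings. The function name `blend` is hypothetical.

```python
# Composite ("virtual") peer for DMU2 built from the dual values of Example 8.1.
lam_dmu1 = 0.2615   # printed as lambda5, weight on DMU1
lam_dmu3 = 0.6615   # printed as lambda4, weight on DMU3

dmu1 = {"inputs": [5, 14], "outputs": [9, 4, 16]}
dmu3 = {"inputs": [7, 12], "outputs": [4, 9, 13]}
dmu2_outputs = [5, 7, 10]


def blend(u, v):
    """Return lam_dmu1*u + lam_dmu3*v component-wise."""
    return [lam_dmu1 * a + lam_dmu3 * b for a, b in zip(u, v)]


peer_outputs = blend(dmu1["outputs"], dmu3["outputs"])
peer_inputs = blend(dmu1["inputs"], dmu3["inputs"])
shortfall = [p - y for p, y in zip(peer_outputs, dmu2_outputs)]

print("peer outputs:", [round(x, 2) for x in peer_outputs])
print("peer inputs: ", [round(x, 2) for x in peer_inputs])
print("shortfall:   ", [round(x, 2) for x in shortfall])
```

Only the third component of the shortfall is materially different from zero, which is why the discussion singles out Output 3.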


SENSITIVITY ANALYSIS. In linear programming, sensitivity analysis is sometimes 
referred to as “what if” analysis. Assume that without management 
providing some additional training, DMU2’s Output 3 value dips from 10 to 
9 units, while Input 2 increases from 15 to 16. We find that these changes in 
the technology coefficients are easily handled when re-solving the LPs. Since 
only DMU2 is affected, we might only modify and solve the LP for DMU2. With 
these changes, DMU2’s efficiency is now only 74% of DMU1 or DMU3. 


Example 8.2. Ranking Five Departments in a College. 

Five science departments in the College of Arts & Sciences are scheduled for 
review. The dean has provided the data in Table 8.3 and asked for relative 
efficiency ratings.² 


TABLE 8.3: Arts & Sciences Department’s Data 

               Inputs       |                Outputs 
Department     No. Faculty  | Student Cr. Hr.   No. Students   Total Degrees 
Biology            25       |     18,341           9,086            63 
Chemistry          15       |      8,190           4,049            23 
Comp. Sci.         10       |      2,857           1,255            31 
Math.              33       |     22,277           6,102            31 
Physics            13       |      6,830           2,910            19 

²Adapted from [Bauldry2009a]. 
Since the data values differ by orders of magnitude, divide both student 
credit hours and number of students by 1,000. 
Follow the same sequence of Maple commands as in the previous example. 

> Inputs := Matrix(5, 1, [25, 15, 10, 33, 13]) : 
  Outputs := [ 18.341  9.086  63 ] 
             [  8.190  4.049  23 ] 
             [  2.857  1.255  31 ] 
             [ 22.277  6.102  31 ] 
             [  6.830  2.910  19 ] : 

> T := <t[i] $ i = 1..3> : 
  W := <w[i] $ i = 1..1> : 

> DMUObj := Outputs . T; 
        [ 18.341 t1 + 9.086 t2 + 63 t3 ] 
        [  8.190 t1 + 4.049 t2 + 23 t3 ] 
        [  2.857 t1 + 1.255 t2 + 31 t3 ] 
        [ 22.277 t1 + 6.102 t2 + 31 t3 ] 
        [  6.830 t1 + 2.910 t2 + 19 t3 ] 

> Inputs . W - Outputs . T >=~ 0; 
  MainConstraints := convert(%, set) : 
        [ 0 <= 25 w1 - 18.341 t1 - 9.086 t2 - 63 t3 ] 
        [ 0 <= 15 w1 - 8.190 t1 - 4.049 t2 - 23 t3  ] 
        [ 0 <= 10 w1 - 2.857 t1 - 1.255 t2 - 31 t3  ] 
        [ 0 <= 33 w1 - 22.277 t1 - 6.102 t2 - 31 t3 ] 
        [ 0 <= 13 w1 - 6.830 t1 - 2.910 t2 - 19 t3  ] 

> Inputs . W =~ 1; 
  DMUcons := convert(%, list) : 
        [ 25 w1 = 1 ] 
        [ 15 w1 = 1 ] 
        [ 10 w1 = 1 ] 
        [ 33 w1 = 1 ] 
        [ 13 w1 = 1 ] 

> NumDMU := 5 : 
  for i to NumDMU do 
    Constraints := MainConstraints union {DMUcons[i]} : 
    LPSolve(DMUObj[i], Constraints, assume = nonnegative, maximize) : 
    Eff[i] := fnormal(%, 7) : 
  end do: 
  Matrix(convert(Eff, list)); 
        [ 1.0        [t1 = 0.01492614, t2 = 0.0, t3 = 0.01152761, w1 = 0.04000000] ] 
        [ 0.7442342  [t1 = 0.09087109, t2 = 0.0, t3 = 0.0, w1 = 0.06666667]        ] 
        [ 1.0        [t1 = 0.0, t2 = 0.0, t3 = 0.03225806, w1 = 0.1000000]         ] 
        [ 0.9201524  [t1 = 0.04130504, t2 = 0.0, t3 = 0.0, w1 = 0.03030303]        ] 
        [ 0.7161341  [t1 = 0.1048513, t2 = 0.0, t3 = 0.0, w1 = 0.07692308]         ] 





We see the DMUs are ranked as: Biology and Computer Science: 100%; 
Mathematics: 92.0%; Chemistry: 74.4%; and Physics: 71.6%. 
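
For Chemistry, Mathematics, and Physics, the printed solutions put all of the output weight on t1 (student credit hours), so with a single input their DEA scores should coincide with credit hours per faculty member measured against the best department. The Python sketch below checks that reading. It is a single-ratio shortcut and in general gives only a lower bound on the DEA efficiency, since the LP is free to favor other outputs; the faculty counts are those entered in the worksheet, with 13 for Physics.

```python
# Credit hours per faculty member, relative to the best department.
# Data from Example 8.2 as entered in the worksheet (credit hours in thousands).
faculty = {"Biology": 25, "Chemistry": 15, "Comp. Sci.": 10, "Math.": 33, "Physics": 13}
credit_hours = {"Biology": 18.341, "Chemistry": 8.190, "Comp. Sci.": 2.857,
                "Math.": 22.277, "Physics": 6.830}

ratio = {dept: credit_hours[dept] / faculty[dept] for dept in faculty}
best = max(ratio.values())
relative = {dept: ratio[dept] / best for dept in faculty}

for dept, score in relative.items():
    print(f"{dept:10s} {score:.4f}")
```

Computer Science scores far below 1 on this single ratio, yet the LP rates it at 100% because it weights total degrees per faculty member instead; that is the flexibility, and the limitation, noted in the list of DEA limitations.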
Examine the results from the dual LPs. 


> for i to NumDMU do 
    DualConstraints := MainConstraints union 
        {lhs(DMUcons[i]) <= 1, lhs(DMUcons[i]) >= 1} : 
    DualLP := simplex:-dual(DMUObj[i], DualConstraints, λ) : 
    LPSolve(DualLP, assume = nonnegative) : 
    DualEff[i] := fnormal(%, 7) : 
  end do: 
  Matrix(convert(DualEff, list)); 
        [ 1.0    [λ1 = 0.0, λ2 = 1.0, λ3 = 0.0, λ4 = 1.0, λ5 = 0.0, λ6 = 0.0, λ7 = 0.0]     ] 
        [ 0.744  [λ1 = 0.0, λ2 = 0.744, λ3 = 0.0, λ4 = 0.447, λ5 = 0.0, λ6 = 0.0, λ7 = 0.0] ] 
        [ 1.0    [λ1 = 0.0, λ2 = 1.0, λ3 = 0.0, λ4 = 0.0, λ5 = 0.0, λ6 = 0.0, λ7 = 1.0]     ] 
        [ 0.920  [λ1 = 0.0, λ2 = 0.920, λ3 = 0.0, λ4 = 1.215, λ5 = 0.0, λ6 = 0.0, λ7 = 0.0] ] 
        [ 0.716  [λ1 = 0.0, λ2 = 0.716, λ3 = 0.0, λ4 = 0.372, λ5 = 0.0, λ6 = 0.0, λ7 = 0.0] ] 



































Comparing the values of the λs, improving Output 2, Number of Students, 
will provide the largest gains in efficiency for both Chemistry and Physics. 





Exercises 


1. Table 8.4 lists data for three hospitals where inputs are number of beds 
and labor hours in thousands per month, and outputs, all measured in 
hundreds, are patient-days for patients under 14, between 14 and 65, and 
over 65. Determine the relative efficiency of the three hospitals. 


TABLE 8.4: Three Hospitals’ Data 

         |        Inputs         |       Outputs 
Hospital | No. Beds   Labor Hr.  | <14   14-65   >65 
I        |    5          14      |  9      4      16 
II       |    8          15      |  5      7      10 
III      |    7          12      |  4      9      13 








2. The three hospitals of Exercise 1 have revised procedures. Reanalyze their 
relative efficiencies using the new data of Table 8.5. 


TABLE 8.5: Three Hospitals’ Revised Data 

         |        Inputs         |       Outputs 
Hospital | No. Beds   Labor Hr.  | <14   14-65   >65 
I        |    4          16      |  6      5      15 
II       |    9          13      | 10      6       9 
III      |    5          11      |  5     10      12 








3. The First National Bank of Spruce Pine, NC, has four branches in the 
greater Spruce Pine metropolitan area. The CEO directed an efficiency 
study be undertaken. The data to be collected is: 

INPUT 1: labor hours (hundred per month) 
INPUT 2: space used for tellers (hundreds of square feet) 


INPUT 3: supplies used (dollars per month) 


OUTPUT 1: loan applications per month 
OUTPUT 2: deposits (thousands of dollars per month) 


OUTPUT 3: checks processed (thousands of dollars per month) 


The data for the bank branches appears in Table 8.6. 
TABLE 8.6: Bank Branch Data 

       |            Inputs             |          Outputs 
Branch | Labor Hr.   Space   Supplies  | Loans   Deposits   Checks 
I      |    15        20        50     |  200       15        35 
II     |    14        23        51     |  220       18        45 
III    |    16        19        51     |  210       17        20 
IV     |    13        18        49     |  199       21        35 








(a) Determine the branches’ relative efficiencies. 


(b) What “best practices” might you suggest to the branches that are 
less efficient? 








8.3 Simple Additive Weighting 


Simple Additive Weighting (SAW) was developed in the 1950s by Churchman 
and Ackoff [ChurchmanAckoff1954]; it is the simplest of the MADM methods, 
yet still one of the most widely used since it is easy to implement. SAW, also 
called the weighted sum method [Fishburn1967], is a straightforward and easily 
executed process. 


Methodology 


Given a set of n alternatives and a set of m criteria for choosing among the 
alternatives, SAW creates a function for each alternative rating its overall 
utility. Each alternative is assessed with regard to every criterion (attribute) 
giving the matrix $M = [m_{ij}]$ where $m_{ij}$ is the assessment of alternative i 
with respect to criterion j. Each criterion is given a weight $w_j$; the sum of 
all weights must equal 1; i.e., $\sum_j w_j = 1$. If the criteria are equally weighted, 
then we merely need to sum the alternative values. The overall or composite 
performance score $P_i$ of the ith alternative with respect to the m criteria is 
given by 

$$
P_i = \frac{1}{m} \sum_{j=1}^{m} w_j \, m_{ij}
$$

for i = 1, ..., n. Write $P = [P_i]$ using matrix/vector notation as 

$$
P = \frac{1}{m} \, M \vec{w},
$$

where $\vec{w} = [w_j]$. The alternative with the highest value of $P_i$ is the best relative 
to the chosen criteria weighting. 

Originally, all the units of the criteria had to be identical, such as dol- 
lars, pounds, seconds, etc. A normalization process making the values unitless 
relaxes the requirement. We recommend always normalizing the data. 


Strengths and Limitations 


The main strengths are (1) ease of use, and (2) normalized data allows for com- 
parisons across many differing criteria. Limitations include “larger is always 
better” or “smaller is always better.” The method lacks flexibility in stating 
which criterion should be larger or smaller to achieve better performance, 
thus making gathering useful data with the same relational schema (larger or 
smaller) essential. 


Sensitivity Analysis 


Sensitivity analysis should be used to determine how sensitive the model is to 
the chosen weights. A decision maker can choose arbitrary weights, or, choose 
weights using a method that performs pairwise comparisons (as done with the 
analytic hierarchy process discussed later in this chapter). Whenever weights 
are chosen subjectively, sensitivity analysis should be carefully undertaken. 
Later sections investigate techniques for sensitivity analysis that can be used 
for individual criteria weights. 


Examples of SAW 


A Maple procedure, SAW, to compute rankings using simple additive weight- 
ing is included in the book’s PSMv2 package. The parameters are the matrix 
of criteria data for each alternative and the weights, either as a vector or a 
comparison matrix. 


> with(PSMv2) : 
Describe(SAW); 
# Usage: SAW(M:data matrix [rows:alternatives, columns:criteria], 
# weights: vector or comparison matrix) 
SAW( AltM::Matrix, w ) 





To examine the program’s code, use print( SAW). 


Example 8.3. Selecting a Car. 

It’s time to purchase a new car. Six cars have made the final list: Ford Fusion, 
Toyota Prius, Toyota Camry, Nissan Leaf, Chevy Volt, and Hyundai Sonata. 
There are seven criteria for our decision: cost, mileage (city and highway), 
performance, style, safety, and reliability. The information in Table 8.7 has 
been collected online from the Consumer Reports and US News and World 
Report websites. 
Cars     Cost      MPG    MPG       Perfor-   Interior   Safety   Reliability 
         ($1000)   City   Highway   mance     & Style 
Prius    27.8      44     40        7.5       8.7        9.4      3 
Fusion   28.5      47     47        8.4       8.1        9.6      4 
Volt     38.7      35     40        8.2       6.3        9.6      3 
Camry    25.5      43     39        7.8       7.5        9.4      5 
Sonata   27.5      36     40        7.6       8.3        9.6      5 
Leaf     36.2      40     40        8.1       8.0        9.4      3 

Initially, we assume all criteria are weighted equally to obtain a baseline 
ranking. Even though the different criteria values are relatively close, let’s 
normalize the data for illustration. There are three typical methods to use: 

$$
m_{ij} \to \frac{m_{ij}}{M_j}, \qquad
m_{ij} \to \frac{m_{ij} - m_j}{M_j - m_j}, \qquad
m_{ij} \to \operatorname{rank}_j(m_{ij}),
$$

where $M_j$ and $m_j$ are the maximum and minimum values in the jth column, 
and $\operatorname{rank}_j$ is the rank order of the jth column. Our SAW program from the 
PSMv2 package uses the first method. 

We’ll exclude the cost data from our first baseline ranking since larger 
cost is worse, whereas for all the other data larger is better. SAW requires 
consistent criteria ranking. 


> CarModel := Vector[row]([Prius, Fusion, Volt, Camry, Sonata, Leaf]) : 

> CarData := [ 44    40  7.5  8.7  9.4  3 ] 
             [ 47    47  8.4  8.1  9.6  4 ] 
             [ 35    40  8.2  6.3  9.6  3 ] 
             [ 43    39  7.8  7.5  9.4  5 ] 
             [ 36    40  7.6  8.3  9.6  5 ] 
             [ 36.2  40  8.1  8.0  9.4  3 ] : 
  Price := <27.8, 28.5, 38.7, 25.5, 27.5, 36.2> : 
  equalWeights := <1/6. $ 6> : 

The SAW function will return a matrix of rankings. The first row is a raw 
ranking, the second row is normalized so the largest ranking is 1. We’ll apply 
fnormal to the rankings and add a legend to make it easier to read the results. 
> SAW(CarData, equalWeights) : 
  <CarModel, fnormal(%, 3)>; 
        [ Prius  Fusion  Volt   Camry  Sonata  Leaf  ] 
        [ 0.877  0.955   0.816  0.919  0.913   0.847 ] 
        [ 0.918  1.0     0.854  0.962  0.955   0.887 ] 

Our rank ordering with equal weighting is Fusion, Camry, Sonata, Prius, Leaf, 
and Volt. 

Let’s add cost to the ranking. In order to match “larger is better,” invert 
cost by c_i → 1/c_i, then append this criterion to the data. 

> PriceR := Price^~(-1) :   # the ‘~’ applies the power to each element 

> <CarModel, fnormal(SAW(<PriceR | CarData>, <1./7 $ 7>), 3)>; 
        [ Prius  Fusion  Volt   Camry  Sonata  Leaf  ] 
        [ 0.882  0.947   0.794  0.931  0.915   0.827 ] 
        [ 0.932  1.0     0.838  0.983  0.966   0.874 ] 

With cost included, the scores shift slightly and the rank order is Fusion, 
Camry, Sonata, Prius, Leaf, and Volt; Fusion still leads. Should cost 
weigh more than other criteria? 
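
The baseline and cost-inclusive rankings can be reproduced without the PSMv2 package. The Python sketch below is an independent re-implementation, not the book’s SAW procedure: it assumes scaling by the column maximum (the first normalization method) followed by a plain weighted sum, and the function name `saw` is hypothetical. The data are taken from Table 8.7, including the Leaf’s city mileage of 40.

```python
# Car data from Table 8.7: price in $1000, then city MPG, highway MPG,
# performance, interior & style, safety, reliability.
cars = {
    "Prius":  (27.8, [44, 40, 7.5, 8.7, 9.4, 3]),
    "Fusion": (28.5, [47, 47, 8.4, 8.1, 9.6, 4]),
    "Volt":   (38.7, [35, 40, 8.2, 6.3, 9.6, 3]),
    "Camry":  (25.5, [43, 39, 7.8, 7.5, 9.4, 5]),
    "Sonata": (27.5, [36, 40, 7.6, 8.3, 9.6, 5]),
    "Leaf":   (36.2, [40, 40, 8.1, 8.0, 9.4, 3]),
}


def saw(rows, weights):
    """Scale each column by its maximum, then return each row's weighted sum."""
    col_max = [max(col) for col in zip(*rows)]
    return [sum(w * x / m for w, x, m in zip(weights, row, col_max))
            for row in rows]


names = list(cars)

# Baseline: six "larger is better" criteria, equal weights.
baseline = saw([crit for _, crit in cars.values()], [1 / 6] * 6)
baseline_scores = dict(zip(names, baseline))

# Add cost as a seventh criterion after inverting it (1/price).
with_cost = saw([[1 / price] + crit for price, crit in cars.values()], [1 / 7] * 7)
cost_scores = dict(zip(names, with_cost))

for label, scores in (("baseline", baseline_scores), ("with cost", cost_scores)):
    ranked = sorted(scores, key=scores.get, reverse=True)
    print(label, [(name, round(scores[name], 3)) for name in ranked])
```

Under these assumptions the scores for Prius, Fusion, Volt, Camry, and Sonata agree with the printed raw rankings to three decimals. The Leaf comes out near 0.861 at baseline rather than the printed 0.847; the printed value is what results when 36.2 is used as the Leaf’s city mileage. The rank order is the same in either case.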

A weighting vector can be created from pairwise preference assessments. 
This technique was introduced by Saaty in 1980 when he developed the 
analytic hierarchy process that we’ll study in Section 8.4. Decide which item of 
the pair is more important and by how much using the scale of Table 8.8. If 
Criterion_i is k times as important as Criterion_j, then Criterion_j is 1/k, the 
reciprocal, times as important as Criterion_i. 


TABLE 8.8: Saaty’s Nine-Point Scale 

Importance level | Definition 
       1         | Equal importance 
       2         | Intermediate 
       3         | Moderate importance 
       4         | Intermediate 
       5         | Strong importance 
       6         | Intermediate 
       7         | Very Strong importance 
       8         | Intermediate 
       9         | Extreme importance 

Begin with comparing cost to the other criteria. 


         | Cost   City   Hwy   Perform.   Style   Safety   Reliable 
Cost     |  1      4      5       3         7       2         3 

Make all the pairwise assessments creating an upper triangular comparison 
matrix. 

         | Cost   City   Hwy   Perform.   Style   Safety   Reliable 
Cost     |  1      4      5       3         7       2         3 
City     |         1      3      1/2        6      1/3       1/3 
Hwy      |                1      1/4        3      1/4       1/4 
Perform. |                        1         6      1/2       1/2 
Style    |                                  1      1/6       1/5 
Safety   |                                          1         2 
Reliable |                                                    1 

Note that if cost is 3 (weakly more important) to performance, then 
performance is 1/3, the reciprocal value, to cost. Use reciprocals to fill in the lower 
triangle of the matrix, so that if m_ij = k, then m_ji = 1/k. 

         | Cost   City   Hwy   Perform.   Style   Safety   Reliable 
Cost     |  1      4      5       3         7       2         3 
City     | 1/4     1      3      1/2        6      1/3       1/3 
Hwy      | 1/5    1/3     1      1/4        3      1/4       1/4 
Perform. | 1/3     2      4       1         6      1/2       1/2 
Style    | 1/7    1/6    1/3     1/6        1      1/6       1/5 
Safety   | 1/2     3      4       2         6       1         2 
Reliable | 1/3     3      4       2         5      1/2        1 




A comparison matrix is a positive reciprocal matrix. This type of matrix has 
a dominant positive eigenvalue and eigenvector. That eigenvector will be our 
weighting vector. The SAW program will compute (approximate) the 
weighting vector from a comparison matrix using an abbreviated power method. 

We must ensure that the comparison assessments are consistent; i.e., if a 
is preferred to b and b is preferred to c, then a is preferred to c. According to 
Saaty’s scheme, compute the consistency ratio CR as a test. The value of CR 
must be less than or equal to 0.1 to be considered consistent. If CR > 0.1, the 
preference choices must be revisited and adjusted. First compute the largest 
eigenvalue λ of the comparison matrix. Then calculate the consistency index 

$$
CI = \frac{\lambda - n}{n - 1}.
$$

Now determine the consistency ratio CR = CI/RI where RI, the random 
index (see [Saaty1980]), is taken from 

n  | 1     2     3      4      5      6      7      8      9      10 
RI | 0.0   0.0   0.58   0.90   1.12   1.24   1.32   1.41   1.45   1.49 


Enter the comparison matrix in Maple, calling it CM. Use the LinearAlgebra 
package to find that the dominant eigenvalue of CM is λ ≈ 7.392. Thus, the 
consistency ratio is 

$$
CR = \frac{7.392 - 7}{7 - 1} \cdot \frac{1}{1.32} \approx 0.05.
$$

Our CR is well below 0.1; we have a consistent prioritization of our criteria. 
Use SAW once more to find our new rankings. 

> <CarModel, fnormal(SAW(<PriceR | CarData>, CM), 3)>; 
        [ Prius  Fusion  Volt   Camry  Sonata  Leaf  ] 
        [ 0.880  0.931   0.779  0.966  0.934   0.798 ] 
        [ 0.911  0.963   0.806  1.0    0.967   0.826 ] 

Preference ranking the criteria changed the result to Camry, Sonata, Fusion, 
Prius, Leaf, and Volt. The leaders changed places. 

Since the importance values chosen are subjective judgments, sensitivity 
analysis is a must. The sensitivity analysis for this example is left as an 
exercise. 
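
The dominant eigenvalue, the weighting vector, and the consistency ratio for the car comparison matrix can be approximated with a plain power iteration. The Python sketch below is an assumption-laden stand-in: the book’s SAW program is described only as using an “abbreviated power method,” whose details are not given, so the iteration count and the function name `power_method` here are choices of this sketch. The random index 1.32 is the n = 7 entry of the RI table.

```python
# Comparison matrix for Cost, City, Hwy, Perform., Style, Safety, Reliable.
cm = [
    [1,   4,   5,   3,   7, 2,   3],
    [1/4, 1,   3,   1/2, 6, 1/3, 1/3],
    [1/5, 1/3, 1,   1/4, 3, 1/4, 1/4],
    [1/3, 2,   4,   1,   6, 1/2, 1/2],
    [1/7, 1/6, 1/3, 1/6, 1, 1/6, 1/5],
    [1/2, 3,   4,   2,   6, 1,   2],
    [1/3, 3,   4,   2,   5, 1/2, 1],
]


def power_method(matrix, iterations=100):
    """Return (dominant eigenvalue, eigenvector scaled to sum to 1)."""
    size = len(matrix)
    v = [1.0 / size] * size
    lam = 0.0
    for _ in range(iterations):
        product = [sum(a * x for a, x in zip(row, v)) for row in matrix]
        lam = sum(product)            # v sums to 1, so this estimates the eigenvalue
        v = [p / lam for p in product]
    return lam, v


n = len(cm)
lam, weights = power_method(cm)
ci = (lam - n) / (n - 1)   # consistency index
cr = ci / 1.32             # consistency ratio, RI = 1.32 for n = 7

print("lambda  =", round(lam, 3))
print("weights =", [round(x, 3) for x in weights])
print("CR      =", round(cr, 3))
```

The resulting weights put cost first, followed by safety and reliability, which matches the strong preferences recorded in the first row of the comparison matrix.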





A Krackhardt “Kite network,” shown in Figure 8.1, is a simple graph with 
10 vertices that has three different answers to the question, “Which vertex is 
central?” depending on the definition of “central.” Krackhardt introduced the 
graph in 1990 as a fictional social network.³ 


Example 8.4. Krackhardt’s Kite Network. 

In the Kite network, Susan is “central” as she has the most connections, Steve 
and Sarah are “central” as they are closest to all the others, and Claire is 
“central” as a critical connection between the largest disjoint subnetworks. 
ORA-PRO,⁴ “a tool for network analysis, network visualization, and network 
forecasting,” returns the data in Table 8.9 for the kite. Use SAW to rank the 
nodes. 


After consulting with several network experts and combining their 
comparison matrices,⁵ we have the weighting vector for [TC, BTW, EC, INC] 


w = [0.47986  0.26215  0.155397  0.102592]. 


³D. Krackhardt, “Assessing the Political Landscape: Structure, Cognition, and Power in 
Organizations.” Admin. Sci. Quart. 35, 1990, pp. 342-369. 

⁴Available from Netanomics, http://netanomics.com. 

⁵A standard method uses the harmonic means of the experts’ comparison matrices. 
FIGURE 8.1: Krackhardt’s “Kite Network” 


TABLE 8.9: ORA Metric Measures for the Kite Network 


TC BTW EC INC 
Susan | 0.1806 0.2022 0.1751 0.1088 
Steve | 0.1389 0.1553 0.1375 0.1131 
Sarah | 0.1250 0.1042 0.1375 0.1131 
Tom | 0.1111 0.0194 0.1144 0.1009 
Claire | 0.1111 0.0194 0.1144 0.1009 
Fred | 0.0833 0.0000 0.0938 0.0975 
David | 0.0833 0.0000 0.0938 0.0975 
Claudia | 0.0833 0.3177 0.1042 0.1088 
Bob | 0.0556 0.1818 0.0241 0.0885 
Jenn | 0.0278 0.0000 0.0052 0.0707 








Table Legend: TC - Total Centrality; BTW - Betweenness; EC - Eigenvector Cen- 
trality; INC - Information Centrality 


The consistency ratio for the combined comparison matrix is 0.003; that is 
well less than 0.1. 
It’s time for Maple. 


> Labels := Vector[row]([Susan, Steve, Sarah, Tom, Claire, Fred, David, 
        Claudia, Bob, Jenn]) : 
  Kite := Matrix([[0.1806, 0.2022, 0.1751, 0.1088], 
        [0.1389, 0.1553, 0.1375, 0.1131], 
        [0.1250, 0.1042, 0.1375, 0.1131], 
        [0.1111, 0.0194, 0.1144, 0.1009], 
        [0.1111, 0.0194, 0.1144, 0.1009], 
        [0.0833, 0.0000, 0.0938, 0.0975], 
        [0.0833, 0.0000, 0.0938, 0.0975], 
        [0.0833, 0.3177, 0.1042, 0.1088], 
        [0.0556, 0.1818, 0.0241, 0.0885], 
        [0.0278, 0.0000, 0.0052, 0.0707]]) : 
  w := <0.47986, 0.26215, 0.155397, 0.102592> : 

> Results := <Labels, fnormal(SAW(Kite, w), 3)>; 

  Susan  Steve  Sarah  Tom    Claire  Fred   David  Claudia  Bob    Jenn 
  0.901  0.722  0.643  0.504  0.504   0.393  0.393  0.675    0.399  0.143 
  1.0    0.801  0.714  0.560  0.560   0.436  0.436  0.749    0.443  0.158 






The results are easier to parse when we sort the array. The program MatrixSort
is in the PSMv2 package with syntax MatrixSort(Matrix, (row/col), (options)).
The options are sortby='row'/'column' and order='ascending'/'descending'
with defaults 'row' and 'ascending'.


> MatrixSort(Results, 3, order = 'descending');

  Susan  Steve  Claudia  Sarah  Claire  Tom    Bob    David  Fred   Jenn
  0.901  0.722  0.675    0.643  0.504   0.504  0.399  0.393  0.393  0.143
  1.0    0.801  0.749    0.714  0.560   0.560  0.443  0.436  0.436  0.158





We see the resulting rank order for “overall centrality” is


Susan > Steve > Claudia > Sarah > Claire = Tom > Bob > David = Fred > Jenn.

SENSITIVITY ANALYSIS. We can apply sensitivity analysis to the weights to
determine how changes impact the final rankings. We recommend using an
algorithmic method to modify the weights. For example, if we swap the
weights for TC (total centrality) and BTW (betweenness), the rankings change
to


Susan Claudia Steve Sarah Bob Claire Tom David Fred Jenn 
0.822 0.792 0.661 0.563 0.457 0.384 0.384 0.293 0.293 0.109 
1.0 0.964 0.804 0.686 0.556 0.467 0.467 0.356 0.356 0.133 


Susan is still the “top node,” but Claudia and Steve have swapped places for
second and third; Bob now ranks above Claire and Tom rather than below them.
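
The SAW scores above can be cross-checked outside Maple. The Python sketch
below is not the PSMv2 SAW program; it assumes that each column of Table 8.9
is scaled by its largest entry before the weighted sum is taken, an assumption
that reproduces the first row of the Results array. The helper names
saw_scores and ranking are hypothetical.

```python
# Simple additive weighting (SAW) on the Kite network data of Table 8.9.
# Assumption: every column is a "larger is better" measure scaled by its maximum.

NAMES = ["Susan", "Steve", "Sarah", "Tom", "Claire",
         "Fred", "David", "Claudia", "Bob", "Jenn"]
KITE = [
    [0.1806, 0.2022, 0.1751, 0.1088],
    [0.1389, 0.1553, 0.1375, 0.1131],
    [0.1250, 0.1042, 0.1375, 0.1131],
    [0.1111, 0.0194, 0.1144, 0.1009],
    [0.1111, 0.0194, 0.1144, 0.1009],
    [0.0833, 0.0000, 0.0938, 0.0975],
    [0.0833, 0.0000, 0.0938, 0.0975],
    [0.0833, 0.3177, 0.1042, 0.1088],
    [0.0556, 0.1818, 0.0241, 0.0885],
    [0.0278, 0.0000, 0.0052, 0.0707],
]
W = [0.47986, 0.26215, 0.155397, 0.102592]  # TC, BTW, EC, INC


def saw_scores(data, weights):
    """Weighted sum of max-scaled columns, one score per row."""
    col_max = [max(col) for col in zip(*data)]
    return [sum(w * x / m for w, x, m in zip(weights, row, col_max))
            for row in data]


def ranking(names, scores):
    """Names ordered from highest to lowest score (ties keep input order)."""
    return [n for n, _ in sorted(zip(names, scores), key=lambda p: -p[1])]


base = saw_scores(KITE, W)
swapped = saw_scores(KITE, [W[1], W[0], W[2], W[3]])  # swap TC and BTW weights

print(ranking(NAMES, base))
print(ranking(NAMES, swapped))
```

Swapping the first two weights reproduces the reordering discussed above:
Claudia moves ahead of Steve, and Bob moves ahead of Tom and Claire.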
Exercises

Use SAW in each problem to find the ranking under the weights:

(a) Equal weighting.

(b) Choose and state weights.


1. Rank order Hospital A’s procedures using the data below.

Hospital A       |        Procedure
                 |   I     II    III    IV
X-Ray Time       |   6     5      4      3
Laboratory Time  |   5     4      3      2
Profit           | $200  $150   $100    $80





2. Rank order Hospital B’s procedures using the data below.

Hospital B       |        Procedure
                 |   I     II    III    IV
X-Ray Time       |   6     5      5      3
Laboratory Time  |   5     4      3      3
Profit           | $190  $150   $110    $80





3. A college student is planning to move to a new city after graduation. Rank
the cities in order of best-to-move-to given the following data.

City  | Housing        Cultural      Crime   Quality of
      | Affordability  Opportunity   Rate    Schools
I     |    250             5          10       0.75
II    |    325             4          12       0.60
III   |    676             6           9       0.81
IV    |  1,020            10           6       0.80
V     |    275             3          11       0.35
VI    |    290             4          13       0.41
VII   |    425             6          12       0.62
VIII  |    500             7          10       0.73
IX    |    300             8           9       0.79

LEGEND: Housing Affordability: avg. home cost in $100,000s; Cultural
Opportunity: events per month; Crime Rate: # crimes reported per month
in 100’s; Quality of Schools: index in [0, 1].


4. Rank order the threat information collected by the Risk Assessment Office 
that is shown in Table 8.1 (pg. 339) for the Department of Homeland 
Security. 








8.4 Analytical Hierarchy Process 


The analytical hierarchy process (AHP) is a multi-objective decision anal- 
ysis tool first proposed by Thomas Saaty [Saaty1980]. The technique has 
become very popular—a Google Scholar search for “analytical hierarchy pro- 
cess” returns over 1.5 million items in under a tenth of a second. 


Description and Uses 


The process is designed for using either objective or subjective measures
to evaluate a set of alternatives based upon multiple criteria, organized in a
hierarchical structure as shown in Figure 8.2. The goal or objective is at the
top level. The next layer holds the criteria that are evaluated or weighted.
The bottom level has the alternatives, which are measured against each
criterion. The decision maker makes pairwise comparisons of the criteria in
which every pair is subjectively or objectively compared. Subjective
assessments use Saaty’s nine-point scale that we introduced with SAW. (See
Table 8.8.)

FIGURE 8.2: A Generic Three-Layer AHP Hierarchy

The AHP process can be described as a method to decompose a problem
into sub-problems. In most cases, the decision maker has a choice among many
alternatives. Each alternative has a set of attributes or characteristics,
which AHP calls criteria, that can be measured, either subjectively or
objectively. The attribute elements of the hierarchical process can relate to
any aspect of the decision problem, either tangible or intangible, carefully
measured or roughly estimated, well or poorly understood; essentially anything
that applies to the decision at hand.

To perform an AHP, we need a goal or an objective and a set of alternatives, 
each with criteria (attributes) to compare. Once the hierarchy is built, the 
decision makers systematically pairwise-evaluate the various elements (com- 
paring them to one another two at a time), with respect to their impact on an 
element above them in the hierarchy. The decision makers can use concrete 
data about the elements or subjective judgments concerning the elements’ rel- 
ative meaning and importance for making the comparisons. Since subjective 
judgments are imperfect, sensitivity analysis will be very important. 

The process converts subjective evaluations to numerical values that can 
be processed and compared over the entire range of the problem. A prior- 
ity or numerical weight is derived for each element of the hierarchy, allowing 
diverse and often incommensurable elements to be rationally and consistently 
compared to one another. The final step of the process calculates a numerical 
priority score for each decision alternative. These scores represent the alter- 
natives’ relative ability to achieve the decision’s goal; they allow a straight- 
forward consideration of the various courses of action. 

AHP can be used by individuals for simple decisions or by teams working 
on large, complex problems. The method has unique advantages when impor- 
tant elements of the decision are difficult to quantify or compare, or where 
communication among team members is impeded by their different special- 
izations, lexicons, or perspectives. 


Methodology of the Analytic Hierarchy Process 


The procedure for AHP can be summarized as: 


STEP 1. Build the hierarchy for the decision following Figure 8.2.
Goal: Select the best alternative
Criteria: List c1, c2, c3, ..., cm
Alternatives: List a1, a2, a3, ..., an


STEP 2. Judgments and Comparison.

Build comparison matrices using the 9-point scale of pairwise comparisons shown
in Table 8.10 for the criteria (attributes) and for the alternatives relative
to each criterion. A problem with m criteria and n alternatives will require
m + 1 matrices: one m x m matrix comparing the criteria and, for each
criterion, one n x n matrix comparing the alternatives. Find the dominant
eigenvector of each matrix. The power method is often used to calculate
eigenvectors; see, e.g., [BurdenFaires2005]. The goal, in AHP, is to obtain a
set of eigenvectors of the system that measure the importance of alternatives
with respect to the criteria.
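
To make the eigenvector step concrete, here is a minimal Python sketch of the
power method applied to a pairwise comparison matrix. It is not the book's
AnalyzeComparisonMat program; the function name power_method, the tolerance,
and the illustrative 4 x 4 reciprocal matrix are all assumptions made for
this sketch.

```python
# Power method for the dominant eigenvector of a positive comparison matrix.
# The matrix A below is illustrative only.

def power_method(matrix, tol=1e-10, max_iter=1000):
    """Return (priority vector summing to 1, estimate of the largest eigenvalue)."""
    n = len(matrix)
    v = [1.0 / n] * n                      # start from equal priorities
    lam = 0.0
    for _ in range(max_iter):
        av = [sum(a * x for a, x in zip(row, v)) for row in matrix]
        lam = sum(av)                      # v sums to 1, so sum(A v) estimates lambda
        new_v = [x / lam for x in av]      # renormalize so the entries sum to 1
        if max(abs(p - q) for p, q in zip(new_v, v)) < tol:
            return new_v, lam
        v = new_v
    return v, lam


A = [[1,   2,   3,   4],
     [1/2, 1,   2,   3],
     [1/3, 1/2, 1,   2],
     [1/4, 1/3, 1/2, 1]]

weights, lam_max = power_method(A)
print([round(w, 4) for w in weights], round(lam_max, 4))
```

Because every entry of a comparison matrix is positive, the iteration settles
on a single positive vector; normalizing it to sum to 1 gives the priorities.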


TABLE 8.10: Saaty’s Nine-Point Scale

Importance level | Definition
       1         | Equal importance
       3         | Moderate importance
       5         | Strong importance
       7         | Very Strong importance
       9         | Extreme importance

TABLE NOTES: Even numbers represent intermediate importance levels which
should only be used as compromises. If the importance level of A to B is 3,
then that of B to A is 1/3, the reciprocal.


Saaty’s consistency ratio CR measures the consistency of the pairwise
assessments of relative importance. The value of CR must be less than or equal
to 0.1 to be considered valid. To compute CR, start by approximating the
largest eigenvalue $\lambda$ of the comparison matrix, where n is the order of
that matrix. Then calculate the consistency index CI with

$$ CI = \frac{\lambda - n}{n - 1}. $$

Finally, CR = CI/RI, where RI is the random index (from [Saaty1980])
found from

n  |  1     2     3     4     5     6     7     8     9     10
RI | 0.0   0.0   0.58  0.90  1.12  1.24  1.32  1.41  1.45  1.49





If CR > 0.1, we must go back to our pairwise comparisons and repair the 
inconsistencies. In general, consistency ensures that if A > B and B > C, 
then A > C for all A, B, and C. 
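
The consistency check can be sketched in a few lines of Python. This sketch
swaps in a common approximation rather than an exact eigenvalue solver: the
priorities are taken as normalized row geometric means, and the largest
eigenvalue is estimated as the sum of column sums times priorities. The
function name and the sample matrix are assumptions for illustration, and the
book's Maple routine may compute these quantities differently.

```python
import math

# Random index values for n = 1..10, as tabulated above.
RI = [0.0, 0.0, 0.58, 0.90, 1.12, 1.24, 1.32, 1.41, 1.45, 1.49]


def consistency_ratio(matrix):
    """Approximate CR = CI / RI for a square pairwise comparison matrix."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    geo = [math.prod(row) ** (1.0 / n) for row in matrix]   # row geometric means
    total = sum(geo)
    w = [g / total for g in geo]                            # approximate priorities
    col_sums = [sum(col) for col in zip(*matrix)]
    lam = sum(c * x for c, x in zip(col_sums, w))           # approximate lambda
    ci = (lam - n) / (n - 1)
    return ci / RI[n - 1]


M = [[1, 1/5, 1/3],
     [5, 1,   3],
     [3, 1/3, 1]]
cr = consistency_ratio(M)
print(round(cr, 3), "acceptable" if cr <= 0.1 else "revisit the comparisons")
```

For a perfectly consistent matrix the estimate of the largest eigenvalue
equals n, so CI and CR are both zero.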


STEP 3. Combine all the alternative comparison eigenvectors into a matrix, 
then multiply by the criteria matrix’s eigenvector to obtain an overall com- 
parative ranking. 


STEP 4. After the m criterion weights are combined into an n x m normalized 
matrix (for n alternatives by m criteria), multiply by the criteria ranking 
vector to obtain the final rankings. 


STEP 5. Interpret the order presented by the final ranking. 
Strengths and Limitations


AHP is very widely used in business, industry, and government. The tech- 
nique is quite useful for discriminating between competing options for a range 
of objectives needing to be met. Even though AHP relies on what might be 
seen as obscure mathematics—eigenvectors and eigenvalues of positive recip- 
rocal matrices—the calculations are not complex, and can be carried out with 
a spreadsheet. A decision maker doesn’t need to understand linear algebra the- 
ory to use the technique, but must be aware of its strengths and limitations. 

AHP’s main strength is producing a ranking of alternatives ordered by 
their effectiveness relative to the criteria’s weighting to meet the project goal. 
The calculations of AHP logically lead to the alternatives’ ranking as a con- 
sequence of preference judgments on the relative importance of the criteria 
and on how the alternatives satisfy each of the criteria. Making accurate and 
good-faith relative importance assessments is critical to the method. Manu- 
ally adjusting the pairwise judgments to obtain a predetermined result is quite 
hard, but not impossible. 

A further strength is that AHP provides a heuristic for detecting incon- 
sistent judgments in pairwise comparisons. When there are a large number 
of criteria and/or alternatives, inconsistencies can easily be hidden by the 
problem’s size; AHP highlights hidden inconsistencies. 

A main limitation of AHP comes from being based on eigenvectors of
positive reciprocal matrices. This basis requires that symmetric judgments
must be reciprocal: if A is 3 times more important than B, then B is 1/3 as
important as A.⁶ This restriction can lead to problems such as “rank reversal,”
a change in the ordering of alternatives when criteria or alternatives are added
to or deleted from the initial set compared. Several modifications to AHP
have been proposed to ameliorate this and other related issues. Many of the
enhancements involved ways of computing, synthesizing pairwise comparisons,
and/or normalizing the priority and weighting vectors. In the next section,
we'll see TOPSIS, a method that corrects rank reversal.

Another limitation is implied scaling in the results. The final ranking 
indicates that one alternative is relatively better than another, not by how 
much. For example, suppose that rankings for alternatives (A,B,C) are 
(0.392, 0.406, 0.204). The values only imply that alternatives A and B are 
about equally good (~ 0.4), and C is worse (~ 0.2). The ranking does not 
mean that A and B are twice as good as C. 

Hartwich [Hartwich1999] criticized AHP for not providing sufficient guidance
about structuring the problem to be solved; that is, how to form the
levels of the hierarchy for criteria and alternatives. When project team
members carry out rating items individually or as a group, guidance on
aggregating separate criteria assessments is necessary. As the number of levels
in the hierarchy increases, the complexity of AHP increases faster; n criteria
require O(n²) comparisons.





6 See Saaty’s book [Saaty1980] for his rationale.

Nevertheless, AHP is a very powerful and useful decision tool when used
intelligently.


Sensitivity Analysis 


Using subjective pairwise comparisons in AHP makes sensitivity analysis 
extremely important. How often do we change our minds about the rela- 
tive importance of objects, places, or things? Often enough that we should 
test the pairwise comparison values to determine the robustness of AHP’s 
rankings. Test the decision maker weights to find the “break point” values, 
if they exist, that change the alternatives’ rankings. At a minimum, perform 
trial-and-error sensitivity analysis using a numerical incremental analysis of 
the weights. Numerical incremental analysis works by incrementally changing 
one parameter at a time (OAT), finding the new solution, and graphically 
showing how the rankings change. Several variations of this method are given 
in [BarkerZabinsky2011] and [Hurley2001]. 

Chen [ChenKocaoglu2008] divided sensitivity analysis into three main
groups: numerical incremental analysis, probabilistic simulations, and
mathematical models. Probabilistic simulation uses Monte Carlo simulation
([ButlerJiaDyer1997]) to make random changes in the weights and
simultaneously explore the effect on rankings. Modeling may be used
when it is possible to express the relationship between the input data and the
solution results as a mathematical model. Leonelli’s [Leonelli2012] master’s
thesis outlines these three procedures.
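
As a rough illustration of the probabilistic approach, the Python sketch below
perturbs a weight vector at random and counts how often each alternative
ranks first. Everything in it is hypothetical: the three alternatives A, B, C,
their priority values, the weights, and the perturbation scheme (each weight
scaled by a uniform factor within plus or minus 30 percent, then
renormalized). It is not the procedure of [ButlerJiaDyer1997].

```python
import random

# Hypothetical data: criteria weights and per-criterion priorities.
WEIGHTS = [0.5, 0.3, 0.2]
PRIORITY = {
    "A": [0.5, 0.2, 0.3],
    "B": [0.3, 0.5, 0.2],
    "C": [0.2, 0.3, 0.5],
}


def perturb(weights, spread, rng):
    """Scale each weight by a random factor in [1-spread, 1+spread]; renormalize."""
    raw = [w * rng.uniform(1 - spread, 1 + spread) for w in weights]
    total = sum(raw)
    return [r / total for r in raw]


def top_choice(weights):
    """Alternative with the largest weighted score."""
    scores = {alt: sum(w * p for w, p in zip(weights, row))
              for alt, row in PRIORITY.items()}
    return max(scores, key=scores.get)


def simulate(trials=2000, spread=0.3, seed=1):
    """Count how often each alternative ranks first under random weights."""
    rng = random.Random(seed)
    wins = {alt: 0 for alt in PRIORITY}
    for _ in range(trials):
        wins[top_choice(perturb(WEIGHTS, spread, rng))] += 1
    return wins


wins = simulate()
print(wins)
```

A top alternative that wins in nearly every trial is robust to the weights;
one that wins only slightly more often than a rival signals a nearby break
point worth locating with incremental analysis.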

We prefer numerical incremental analysis, adjusting the weights so that the
new weight $w_j'$ of each remaining criterion is given by

$$ w_j' = \frac{1 - w_p'}{1 - w_p}\, w_j \tag{8.4} $$

where $w_p$ is the original weight of the criterion to be adjusted, $w_p'$ is
the value after the criterion was adjusted, and $w_j$ is the original weight of
criterion j [AlinezhadAmini2011].
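
Equation 8.4 is easy to apply by hand, but a short Python sketch shows why it
keeps the weights summing to 1. The function name adjust_weights and the
sample weights are hypothetical.

```python
def adjust_weights(weights, p, new_wp):
    """Set weights[p] to new_wp and rescale the others by (1 - w_p') / (1 - w_p)."""
    old_wp = weights[p]
    if not 0.0 <= new_wp <= 1.0:
        raise ValueError("the adjusted weight must lie in [0, 1]")
    if old_wp >= 1.0:
        raise ValueError("cannot rescale when the original weight is 1")
    scale = (1.0 - new_wp) / (1.0 - old_wp)
    return [new_wp if j == p else scale * w for j, w in enumerate(weights)]


w = [0.5, 0.3, 0.2]                  # hypothetical original weights
adjusted = adjust_weights(w, 0, 0.4) # lower the first weight from 0.5 to 0.4
print([round(x, 4) for x in adjusted], round(sum(adjusted), 4))
```

The remaining weights keep their proportions to one another (here 3 to 2), so
only the importance of the adjusted criterion relative to the rest changes.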

Whichever method is chosen, sensitivity analysis is always an important 
part of an AHP solution. 


Examples Using AHP 


Let’s look at a selection of examples using AHP, starting with a simple
problem that has 3 criteria and 3 alternatives.


Example 8.5. Selecting a VHF Transceiver Antenna. 

An amateur radio operator wants to build and install a new VHF antenna,
choosing from designs for a vertical 1/4-wave antenna, a 3-element Yagi
antenna, and a J-pole antenna. See Figure 8.3. The three main criteria will
be size, antenna gain (how much the antenna focuses the signal), and ease of
assembly.⁷


FIGURE 8.3: VHF Antenna Types (1/4-Wave Vertical, 3-Element Yagi, J-Pole)


We’ll use the AHP and AnalyzeComparisonMat programs from the book’s 
Maple package PSMv2. After loading the package via with, use the Describe 
command to see a brief description and a list of arguments. 

STEP 1. Build the 3-level hierarchy listing the criteria and alternatives. 
Goal: Select the best antenna 
Criteria: Size, Gain, Assembly 
Alternatives: Vertical, Yagi, J-Pole 


> CritLabels := ["Size", "Gain", "Assembly"]:
  AltLabels := ["Vertical", "Yagi", "J-Pole"]:





STEP 2. Perform the pairwise comparisons using Saaty’s 9-point scale. 

Use the row and column orders specified by the lists in Step 1. The ham oper- 
ator’s choices are as follows. 

Criteria comparison matrix:

> CritPM := Matrix([[1, 0.2,   0.333],
                    [5, 1,     3    ],
                    [3, 0.333, 1    ]]):
  AnalyzeComparisonMat(%);
                    [[0.1047  0.6370  0.2582], 0.03276]


The consistency ratio 0.03 is very good. 
Alternatives by Size comparison matrix:

> SizePM := Matrix([[1,   5, 0.5],
                    [0.2, 1, 0.2],
                    [2,   5, 1  ]]):
  AnalyzeComparisonMat(%);
                    [[0.3522  0.08875  0.5591], 0.04655]

The consistency ratio 0.05 is very good.

7 Adapted from W. Bauldry, “Choosing an Antenna Mathematically,” 2018.


Alternatives by Gain comparison matrix:

> GainPM := Matrix([[1, 0.143, 0.33],
                    [7, 1,     5   ],
                    [3, 0.2,   1   ]]):
  AnalyzeComparisonMat(%);
                    [[0.08080  0.7308  0.1884], 0.05431]

The consistency ratio 0.05 is very good.
Alternatives by Assembly comparison matrix:

> AssemblyPM := Matrix([[1,    5, 3    ],
                        [0.2,  1, 0.333],
                        [0.33, 3, 1    ]]):
  AnalyzeComparisonMat(%);
                    [[0.6374  0.1048  0.2578], 0.03017]


The consistency ratio 0.03 is very good. 


STEP 3. The AHP program will combine the eigenvectors from the three 
alternatives’ priority matrices to form the overall alternative priority matrix. 


STEP 4. Obtain the AHP rankings. 
> Results := AHP(CritLabels, CritPM, AltLabels, SizePM, GainPM, AssemblyPM);

                         "Size"    "Gain"   "Assembly"
  Criteria Weights =     0.1047    0.6370   0.2582

  ConsistencyRatio = 0.03276

                         "Vertical"  "Yagi"   "J-Pole"
  AlternativesRanking =  0.2529      0.5020   0.2451

                         [ 0.3522   0.08080  0.6374 ]
  AltPriorityMatrix =    [ 0.08875  0.7308   0.1048 ]
                         [ 0.5591   0.1884   0.2578 ]










Interpretation of Results. The Yagi is the best choice antenna, with the
vertical and J-pole at essentially the same rating. Since the Yagi was a clear
preference for gain, and gain was the highest rated criterion, the rankings
pass an initial “common-sense test.”
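
The synthesis in Steps 3 and 4 is a single matrix-vector product, which can be
checked independently of the AHP program. The Python sketch below uses the
criteria weights and the alternative priority matrix printed above; the
function name synthesize is hypothetical.

```python
# Combine per-criterion priorities with the criteria weights (antenna example).
CRITERIA_WEIGHTS = [0.1047, 0.6370, 0.2582]      # Size, Gain, Assembly
ALTERNATIVES = ["Vertical", "Yagi", "J-Pole"]
# Rows follow ALTERNATIVES; columns follow the criteria order above.
ALT_PRIORITY = [
    [0.3522,  0.08080, 0.6374],
    [0.08875, 0.7308,  0.1048],
    [0.5591,  0.1884,  0.2578],
]


def synthesize(priority, weights):
    """Multiply the alternative priority matrix by the criteria weight vector."""
    return [sum(p * w for p, w in zip(row, weights)) for row in priority]


overall = synthesize(ALT_PRIORITY, CRITERIA_WEIGHTS)
for name, score in sorted(zip(ALTERNATIVES, overall), key=lambda t: -t[1]):
    print(f"{name:9s} {score:.4f}")
```

The Yagi's large score comes almost entirely from the product of its gain
priority and the gain weight, which is the “common-sense” reading given above.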


The necessary sensitivity analysis will be left to the reader. The first step 
will be to find the break-even points for the criteria ratings. 


Example 8.6. Selecting a Car Redux. 
Revisit Example 8.3 “Selecting a Car” with the data presented in Table 8.7 
(pg. 353), but now use AHP to rank the models. 


STEP 1. Build the 3-level hierarchy and list the criteria from the highest to 
lowest priority (mainly for convenience). 


Goal: Select the best car 
Criteria: Cost, MPG-City, MPG-Hwy, Safety, Reliab., Perform., Style 


Alternatives: Prius, Fusion, Volt, Camry, Sonata, Leaf 


> Criteria := ["Cost", "MPG-City", "MPG-Hwy", "Safety", "Reliab.", "Perform.", "Style"]:


> Model := ["Prius", "Fusion", "Volt", "Camry", "Sonata", "Leaf"]:





STEP 2. Perform the pairwise comparisons using Saaty’s 9-point scale. 

We chose the priority order as: Cost, Safety, Reliability, Performance, MPG- 
City, MPG-Hwy, and Interior & Style. Putting the criteria in priority order 
allows for easier pairwise comparisons. A spreadsheet similar to Figure 8.4 
organizes the pairwise comparisons nicely. Enter the comparisons in Maple as 
the matrix PCM. 


> PCM := Matrix([[1.0,   2.0,   2.0,   3.0,   4.0,   5.0,   6.0],
                 [0.500, 1.0,   2.0,   3.0,   4.0,   5.0,   5.0],
                 [0.500, 0.500, 1.0,   2.0,   2.0,   3.0,   3.0],
                 [0.333, 0.333, 0.500, 1.0,   1.0,   2.0,   3.0],
                 [0.250, 0.250, 0.500, 1.0,   1.0,   2.0,   3.0],
                 [0.200, 0.200, 0.333, 0.500, 0.500, 1.0,   2.0],
                 [0.167, 0.200, 0.333, 0.333, 0.333, 0.500, 1.0]]):
  AnalyzeComparisonMat(PCM);
  [[0.3178  0.2545  0.1515  0.09450  0.08783  0.05443  0.03939], 0.01932]


The consistency ratio 0.02 is very good. 





STEP 3. We enter the AltM matrix with columns listed in the priority order
we chose in Step 1; the order must match the PCM matrix. The cost data
does not follow the rubric “larger is better”; therefore, the reciprocal of cost
is used for the first column.

> AltM := Matrix([[0.0360, 9.4, 3.0, 7.5, 44.0, 40.0, 8.7],
                  [0.0351, 9.6, 4.0, 8.4, 47.0, 47.0, 8.1],
                  [0.0258, 9.6, 3.0, 8.2, 35.0, 40.0, 6.3],
                  [0.0392, 9.4, 5.0, 7.8, 43.0, 39.0, 7.5],
                  [0.0364, 9.6, 5.0, 7.6, 36.0, 40.0, 8.3],
                  [0.0276, 9.4, 3.0, 8.1, 36.2, 40.0, 8.0]]):

Cost to          MPG-City          A  2
                 MPG-Hwy           A  2
                 Safety            A  3
                 Reliability       A  4
                 Performance       A  5
                 Interior & Style  A  6
MPG-City to      MPG-Hwy           A  2
                 Safety            A  3
                 Reliability       A  4
                 Performance       A  5
                 Interior & Style  A  5
MPG-Hwy to       Safety            A  2
                 Reliability       A  2
                 Performance       A  3
                 Interior & Style  A  3
Safety to        Reliability       A  1
                 Performance       A  2
                 Interior & Style  A  3
Reliability to   Performance       A  2
                 Interior & Style  A  3
Performance to   Interior & Style  A  2

FIGURE 8.4: Pairwise Comparison of Criteria (spreadsheet; each row gives the
pair compared, the preferred element, where A is the first of the pair, and
the intensity on the nine-point scale)


Standard methods for dealing with a variable like cost include: (1) replace
cost with 1/cost, (2) use a pairwise comparison with the nine-point scale, (3)
use a pairwise comparison with ratios cost_i/cost_j, or (4) remove cost as a
criterion and a variable, perform the analysis, and then use a benefit/cost
ratio to re-rank the results. Many analysts prefer (4) when cost figures are
large and dominate the procedure.





For the alternatives, we either use the raw data, or we can use pairwise
comparisons by criteria for how each alternative fares versus its competitors.
Here, we take the raw data replacing cost by 1/cost, then normalize the columns
so that each column sums to 1.


> AltMN := evalf(Matrix([seq(AltM[.., j]/add(AltM[i, j], i = 1 .. 6), j = 1 .. 7)]), 4):


STEP 4. Execute the procedure and obtain the output to interpret. 
Note: Our AHP program will accept either the alternatives’ comparison matri- 
ces or the eigenvectors merged into a single alternative comparison matrix. 





> Result := AHP(Criteria, PCM, Model, AltMN);

  Criteria Weights =
      "Cost"  "MPG-City"  "MPG-Hwy"  "Safety"  "Reliab."  "Perform."  "Style"
      0.3178  0.2545      0.1515     0.09450   0.08783    0.05443     0.03939

  ConsistencyRatio = 0.01932

  AlternativesRanking =
      "Prius"  "Fusion"  "Volt"  "Camry"  "Sonata"  "Leaf"
      0.1659   0.1759    0.1468  0.1833   0.1776    0.1504





Once again, use MatrixSort to make the results easier to parse.


> MatrixSort(rhs(Result[3]), 2, order = 'descending');
      "Camry"  "Sonata"  "Fusion"  "Prius"  "Leaf"  "Volt"
      0.1833   0.1776    0.1759    0.1659   0.1504  0.1468
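
The normalization and ranking for this example can be reproduced with a short
Python sketch. It assumes that each column of AltM is divided by its column
sum and that column k is paired with the k-th criteria weight, exactly as the
data are entered above; under those assumptions the scores agree with the
AlternativesRanking output to about three decimal places. The helper name
normalize_columns is hypothetical.

```python
MODELS = ["Prius", "Fusion", "Volt", "Camry", "Sonata", "Leaf"]
# Rows follow MODELS; the first column already holds 1/cost.
ALT = [
    [0.0360, 9.4, 3.0, 7.5, 44.0, 40.0, 8.7],
    [0.0351, 9.6, 4.0, 8.4, 47.0, 47.0, 8.1],
    [0.0258, 9.6, 3.0, 8.2, 35.0, 40.0, 6.3],
    [0.0392, 9.4, 5.0, 7.8, 43.0, 39.0, 7.5],
    [0.0364, 9.6, 5.0, 7.6, 36.0, 40.0, 8.3],
    [0.0276, 9.4, 3.0, 8.1, 36.2, 40.0, 8.0],
]
WEIGHTS = [0.3178, 0.2545, 0.1515, 0.09450, 0.08783, 0.05443, 0.03939]


def normalize_columns(matrix):
    """Divide every entry by its column sum so that each column totals 1."""
    sums = [sum(col) for col in zip(*matrix)]
    return [[x / s for x, s in zip(row, sums)] for row in matrix]


alt_n = normalize_columns(ALT)
scores = [sum(w * x for w, x in zip(WEIGHTS, row)) for row in alt_n]
order = sorted(zip(MODELS, scores), key=lambda t: -t[1])
for name, score in order:
    print(f"{name:7s} {score:.4f}")
```

Because every column is rescaled to total 1, criteria measured in very
different units (reciprocal dollars, miles per gallon, rating points)
contribute on a comparable footing before the weights are applied.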


We see the resulting rank order for the best car is

Camry > Sonata ≈ Fusion > Prius > Leaf > Volt.

SENSITIVITY ANALYSIS. We alter our pairwise comparison values to obtain
a new set of weights and obtain new results: Camry, Fusion, Sonata, Prius,
Leaf, and Volt. First we adjust the weights and place the weights into a new
comparison matrix.

We altered cost, the largest decision criterion, by lowering its value
incrementally. Then we created a matrix of the new decision weights that
included the original set of weights as a reference. We then multiplied it by
the transpose of the normalized matrix of alternatives still in criteria order.
Using Statistics:-Rank finds the ordering. With the changes in the decision
weights, the cars were ranked the same. That is, the resulting values have
changed, but not the relative rankings of the cars. The Maple work follows.


> with(LinearAlgebra):


> rhs(Result[1]);
  w[1] := %[2]:
      "Cost"  "MPG-City"  "MPG-Hwy"  "Safety"  "Reliab."  "Perform."  "Style"
      0.3178  0.2545      0.1515     0.09450   0.08783    0.05443     0.03939







> AdjVec := fnormal(Vector[row]([-1, 1./6 $ 6]), 4);
      [-1  0.1667  0.1667  0.1667  0.1667  0.1667  0.1667]

> w[2] := fnormal(w[1] + 0.025*AdjVec, 4);
      [0.2928  0.2587  0.1557  0.09867  0.09200  0.05860  0.04356]

> for i from 3 to 6 do
      w[i] := fnormal(w[i-1] + 0.025*AdjVec, 4):
  end do:
  NewW := Matrix(convert(convert~(w, list), listlist));

  NewW :=
      0.3178  0.2545  0.1515  0.09450  0.08783  0.05443  0.03939
      0.2928  0.2587  0.1557  0.09867  0.09200  0.05860  0.04356
      0.2678  0.2629  0.1599  0.1028   0.09617  0.06277  0.04773
      0.2428  0.2671  0.1641  0.1070   0.1003   0.06694  0.05190
      0.2178  0.2713  0.1683  0.1112   0.1045   0.07111  0.05607
      0.1928  0.2755  0.1725  0.1154   0.1087   0.07528  0.06024

> fnormal(Transpose(AltM . Transpose(NewW)), 4):
  (Statistics:-Rank~(convert(%, listlist)));
      4, 6, 1, 5, 3, 2
      4, 6, 1, 5, 3, 2
      4, 6, 1, 5, 3, 2
      4, 6, 1, 5, 3, 2
      4, 6, 1, 5, 3, 2
      4, 6, 1, 5, 3, 2











Again, we recommend using sensitivity analysis to try to find break points, 
if any exist. 

In the next sensitivity analysis, take the lowest-weighted criterion, Interior
and Style, and increase its value by 0.1, 0.2, and 0.25, adjusting the other
weights proportionally. The new results show a change in rank ordering between
the 3rd and 4th increments. Thus the break point is adding between 0.2 and 0.25
to the criterion weight for Interior and Style. Verify these computations with
Maple!


Example 8.7. The Kite Network Redux. 
Revisit Krackhardt’s Kite social network; search for the key influencer nodes.
According to Newman,⁸ there are four metrics that contribute to identifying
the key nodes of a network. In our priority order, the key criteria are:

Total Centrality, Betweenness, Eigenvector Centrality, Closeness Centrality

8 See Chapter 4 of M. Newman, Networks: An Introduction, Oxford Univ Press, Oxford,
2010. Available at http://dx.doi.org/10.1093/acprof:oso/9780199206650.001.0001.


Assume we have the outputs from the network analysis program ORA-Pro
(which is not shown here due to the volume of output). Take the metrics
from ORA-Pro and normalize each column. The columns for each criterion are
placed in the matrix $X = [x_{ij}]$. Define $w_j$ to be the weight for each
criterion. Limit the size of the X matrix to 8 alternatives with the four
criteria for this example.

Next, assume we have obtained the criteria pairwise comparison matrix 
from the decision maker. Using the output from ORA-Pro and normalizing 
the results, we are ready for AHP to rate the alternatives within each criterion. 
We provide a sample pairwise comparison matrix with Saaty’s nine-point scale 
for weighting the Kite network criteria. The consistency ratio is CR = 0.01148, 
which is much less than 0.1, so our pairwise comparison matrix is consistent. 
We continue with Maple. 


> CritLabels := [TC, BTW, EC, CC]:
  AltLabels := [Bob, Claire, Fred, Sarah, Susan, Steve, Claudia, Tom]:

> PCM := Matrix([[1,      2,      3,   4],
                 [0.5,    1,      2,   3],
                 [0.3333, 0.5,    1,   2],
                 [0.25,   0.3333, 0.5, 1]]):

> AltM := Matrix([[0.11111111,  0.019407559, 0.114399403, 0.100733634],
                  [0.11111111,  0.019407559, 0.114399403, 0.100733634],
                  [0.08333333,  0.0,         0.093757772, 0.09734763 ],
                  [0.125,       0.104187947, 0.137527978, 0.100733634],
                  [0.180555556, 0.202247191, 0.175080826, 0.122742664],
                  [0.138888889, 0.15526047,  0.137527978, 0.112866817],
                  [0.083333333, 0.317671093, 0.104202935, 0.108634312],
                  [0.055555556, 0.181818182, 0.024123352, 0.088318284]]):

> Result := AHP(CritLabels, PCM, AltLabels, AltM);

  Result :=
  Criteria Weights =
      TC      BTW     EC      CC
      0.4673  0.2772  0.1601  0.09543

  ConsistencyRatio = 0.01148

  AlternativesRanking =
      Bob      Claire   Fred     Sarah   Susan   Steve   Claudia  Tom
      0.09303  0.09303  0.06903  0.1298  0.1967  0.1536  0.1681   0.09676

As before, sort the results to make them easier to read.

> MatrixSort(rhs(Result[3]), 2, order = 'descending');
      Susan   Claudia  Steve   Sarah   Tom      Claire   Bob      Fred
      0.1967  0.1681   0.1536  0.1298  0.09676  0.09303  0.09303  0.06903


AHP gives Susan as the key node. However, the bias of the decision maker 
is important in the analysis of the weights of the criteria. The Betweenness 
criterion is rated 2 to 3 times more important than the others. 








SENSITIVITY ANALYSIS. Changes in the pairwise comparisons for the criteria 
cause fluctuations in the key nodes. The reader should change the pairwise 
comparisons given above so that Total Centrality is not so dominant, and 
rerun the AHP as we did with the previous example. 








Exercises 


1. Redo Section 8.3’s Exercises (pg. 359) using AHP. Compare with your 
previous results using SAW. 


2. In each of the problems above, perform sensitivity analysis by changing
the weight of your dominant criterion until it is no longer the highest. Did
the change affect your rankings?


3. In each of the problems above, find break points, if any exist, for the 
weights. 


4. Suppose a criterion has no break point. What does this indicate about the 
sensitivity of the solution to that criterion’s weight? 
8.5 Technique of Order Preference by Similarity to the
Ideal Solution


In 1981, Hwang and Yoon [HwangYoon1981] introduced the Technique of 
Order Preference by Similarity to the Ideal Solution (TOPSIS) as a multi- 
criteria decision analysis method that is based on comparing the relative “dis- 
tances” of alternatives from a theoretical best solution and a theoretical worst 
solution. The optimal alternative will have the shortest geometric distance 
from the best or positive ideal solution, and the longest geometric distance 
from the worst, or negative ideal solution. The method is a compensatory 
aggregation that compares a set of alternatives by identifying weights for each 
criterion, normalizing the scores for each criterion, and calculating a distance 
between each alternative and the theoretical ideal alternative based on the 
best score in each criterion. TOPSIS requires that the criteria are monotoni- 
cally increasing or decreasing. Normalization is usually required as the criteria 
often have incompatible dimensions. Compensatory methods allow trade-offs 
between criteria, where a poor result in one criterion can be negated by a good 
result in another criterion. This compensation provides a more realistic form 
of modeling than non-compensatory methods which often include or exclude 
alternative solutions based on strict cut-off values. 

A 2012 survey by Behzadian et al.⁹ finds the main areas of application of
TOPSIS include

• Supply Chain Management and Logistics,
• Design, Engineering and Manufacturing Systems,
• Business and Marketing Management,
• Health, Safety and Environment Management,
• Human Resources Management,
• Energy Management,
• Chemical Engineering, and
• Water Resources Management.


We begin with a brief discussion of the framework of TOPSIS as a method
of decomposing a problem into sub-problems. Typically, a decision maker must
choose from many alternatives each having a set of attributes or characteristics
that can be measured subjectively or objectively. The attributes can relate to
any tangible or intangible aspect of the decision problem. Attributes can be
carefully measured or roughly estimated, or be well or poorly understood.
Basically, anything that applies to the decision at hand can be used in the
TOPSIS process.

9 Behzadian, Khanmohammadi, Yazdani, and Ignatius, “A state-of-the-art survey of
TOPSIS applications,” Expert Systems with Applications, 39 (2012): 13051-13069.


Methodology 
The TOPSIS process is carried out as follows. 


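STEP 1. Create an evaluation matrix $X = [x_{ij}]$ consisting of m alternatives
and n criteria, where $x_{ij}$ is alternative i’s value for criterion j.

$$
X = \begin{array}{c|cccc}
       & C_1    & C_2    & \cdots & C_n    \\ \hline
A_1    & x_{11} & x_{12} & \cdots & x_{1n} \\
A_2    & x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
A_m    & x_{m1} & x_{m2} & \cdots & x_{mn}
\end{array}
$$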


STEP 2. X is normalized to form $R = [r_{ij}]_{m \times n}$ using the
normalization

$$ r_{ij} = \frac{x_{ij}}{\sqrt{\sum_{k=1}^{m} x_{kj}^2}}. $$


STEP 3. Calculate the weighted normalized decision matrix T. Weights must
total 1 (100%), and can come from either the decision maker directly, or
by computation such as from the eigenvector of a comparison matrix using
Saaty’s nine-point scale. T is given by

$$ T = [t_{ij}]_{m \times n} = [w_j\, r_{ij}]_{m \times n} \quad \text{for } i = 1..m \text{ and } j = 1..n; $$

i.e., multiply each column by its weight.


STEP 4. Determine each criterion’s best alternative $A_b$ and worst alternative
$A_w$. Examine each attribute’s column and select the largest and smallest
values. If the criterion’s values imply larger is better (e.g., profit), then
the best alternative is the largest value; if the values imply smaller is
better (such as cost), the best alternative is the smallest value. (Whenever
possible, define all criteria in terms of positive impacts.) Separate the index
set of the criteria into two classes:

$J_+$ = {indices of criteria having a positive impact},
and
$J_-$ = {indices of criteria having a negative impact}.

Now define the best, the ideal positive alternative, as

$$ A_b = \{ \langle \min(t_{ij} \mid i = 1..m) \mid j \in J_- \rangle,\; \langle \max(t_{ij} \mid i = 1..m) \mid j \in J_+ \rangle \} = \{ t_{bj} \mid j = 1..n \}, $$

and the worst, the ideal negative alternative, as

$$ A_w = \{ \langle \max(t_{ij} \mid i = 1..m) \mid j \in J_- \rangle,\; \langle \min(t_{ij} \mid i = 1..m) \mid j \in J_+ \rangle \} = \{ t_{wj} \mid j = 1..n \}. $$


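STEP 5. Calculate the Euclidean distances between each alternative and the
ideal positive alternative, $d_{ib}$, and the ideal negative alternative,
$d_{iw}$:

$$ d_{ib} = \sqrt{\sum_{j=1}^{n} (t_{ij} - t_{bj})^2}, \qquad d_{iw} = \sqrt{\sum_{j=1}^{n} (t_{ij} - t_{wj})^2}, \qquad \text{for } i = 1..m. $$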


STEP 6. Calculate each alternative’s similarity to the worst condition

$$ s_{iw} = \frac{d_{iw}}{d_{iw} + d_{ib}} \quad \text{for } i = 1..m. $$

Note that $0 \le s_{iw} \le 1$ for each i, and that

$s_{iw} = 1$ iff alternative i equals the ideal positive alternative, $d_{ib} = 0$; and

$s_{iw} = 0$ iff alternative i equals the ideal negative alternative, $d_{iw} = 0$.


STEP 7. Rank the alternatives by their Siw values. 
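
The seven steps translate almost line for line into code. The following
Python sketch is one possible reading of them, not the book's Maple program:
the function name topsis, its argument layout, and the sample data are
hypothetical, and the sketch assumes no criterion column is all zeros and
that the alternatives are not all identical.

```python
import math


def topsis(x, weights, benefit):
    """Return the similarity scores s_iw for an m x n evaluation matrix.

    x       : m rows (alternatives), each with n criterion values
    weights : n criterion weights summing to 1
    benefit : n booleans, True when larger values are better
    """
    # Step 2: vector normalization, column by column.
    norms = [math.sqrt(sum(v * v for v in col)) for col in zip(*x)]
    # Step 3: weighted normalized matrix T.
    t = [[w * v / nrm for v, w, nrm in zip(row, weights, norms)] for row in x]
    # Step 4: ideal positive and ideal negative value for each criterion.
    cols = list(zip(*t))
    best = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    worst = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    # Steps 5 and 6: distances to both ideals, then the similarity score.
    scores = []
    for row in t:
        d_b = math.dist(row, best)
        d_w = math.dist(row, worst)
        scores.append(d_w / (d_w + d_b))
    return scores


# Hypothetical data: quality (larger is better) and price (smaller is better).
names = ["P", "Q", "R"]
data = [[7, 300], [9, 450], [5, 200]]
scores = topsis(data, [0.6, 0.4], [True, False])

# Step 7: rank by s_iw, highest first.
for name, s in sorted(zip(names, scores), key=lambda p: -p[1]):
    print(name, round(s, 3))
```

Passing the benefit flags explicitly plays the role of the index sets J+ and
J-, so cost-type criteria need not be converted to reciprocals beforehand.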


Normalization 


Wojciech Salabun¹⁰ presents four methods of normalization: three methods of
linear normalization and the vector method we used in Step 2. Each method
has variants for “profit” (larger is better) and “cost” (smaller is better). Vector
normalization, which uses nonlinear distances between single dimension scores
and ratios, should produce smoother trade-offs [HwangYoon1981].
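
To see the difference between the two families of normalization, here is a
small Python sketch. The vector method follows Step 2. The linear scheme shown
(scaling by the column maximum, with 1 minus that ratio for “cost” criteria) is
only an assumed, commonly used example; it is not claimed to be one of the
specific variants in the cited chapter. Function names are hypothetical.

```python
import math


def vector_normalize(column):
    """Vector normalization: divide each entry by the column's Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in column))
    return [x / norm for x in column]


def linear_max_normalize(column, kind="profit"):
    """An assumed linear scheme: scale by the column maximum."""
    top = max(column)
    if kind == "profit":
        return [x / top for x in column]      # larger raw value scores higher
    return [1 - x / top for x in column]      # "cost": smaller raw value scores higher


column = [2.0, 4.0, 8.0]
print(vector_normalize(column))
print(linear_max_normalize(column, "profit"))
print(linear_max_normalize(column, "cost"))
```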





10 W. Salabun, “Normalization of attribute values in TOPSIS method,” Chapter 4 in
Behzadian et al., Nowe Trendy w Naukach Inzynieryjnych, CREATIVETIME, 2012.

Strengths and Limitations


TOPSIS is based on the concept that the chosen alternative should have the 
shortest geometric distance from the positive ideal solution and the longest 
geometric distance from the negative ideal solution. See Figure 8.5. 


[Figure: the Positive Ideal Soln and the Negative Ideal Soln plotted against
two axes, “Increasing Desirability for C1” and “Increasing Desirability for C2”]

FIGURE 8.5: TOPSIS with Two Criteria


Two main advantages of TOPSIS are its ease of use and ease of imple- 
mentation. A standard spreadsheet can handle the computations, and deep 
mathematical expertise is not required. 

The main weakness of TOPSIS is that subjectivity in setting criteria and 
weights can inordinately influence the rankings produced. As always, sensitiv- 
ity analysis is a must. 


Sensitivity Analysis 


Sensitivity analysis is essential to good modeling. The criterion weights are 
the main target of sensitivity analysis to determine how they affect the final 
ranking. The same procedures discussed for AHP are all useful for TOPSIS. 
We will again use Equation 8.4 (pg. 364) to adjust weights in our sensitivity 
analysis [AlinezhadAmini2011]. 
Examples using TOPSIS


We'll revisit examples from AHP and SAW so as to compare results, method 
efficacy, and computational complexity. 


Example 8.8. Selecting a Car Redux. 

The decision maker’s weights from AHP are reused here, slightly modified
because cost is not inverted: use the PCM matrix again, but with cost not
inverted. The input data for the alternatives must be in the same order as
the prioritized criteria.


> Labels := ["Prius", "Fusion", "Volt", "Camry", "Sonata", "Leaf"]:
  AltM := Matrix([[27.8, 9.4, 3, 7.5, 44,   40, 8.7],
                  [28.5, 9.6, 4, 8.4, 47,   47, 8.1],
                  [38.7, 9.6, 3, 8.2, 35,   40, 6.3],
                  [25.5, 9.4, 5, 7.8, 43.8, 39, 7.5],
                  [27.5, 9.6, 5, 7.6, 36,   40, 8.3],
                  [36.2, 9.4, 3, 8.1, 36.2, 40, 8.0]]):
  DMWR := [0.3612, 0.2093, 0.1446, 0.1167, 0.0801, 0.0530, 0.0351]:
Use the TOPSIS program from the book’s PSMv2 package.


> with(PSMv2):

> Describe(TOPSIS);

# TOPSIS(AltM, Weights, <Best=[max/min,...]>) returns the
# ranking of the alternatives given the Weights. Optional
# argument Best=[max/min,...] lists type of ’best’ element
# for each criterion.
TOPSIS( Labels::{Vector, list}, A::Matrix, W::{Vector, list},
        {Best := [max$15]} )

> [min, max$6];
  Result := TOPSIS(Labels, AltM, DMWR, Best = %);

      [min, max, max, max, max, max, max]

      Result :=
        [ "Prius"  "Fusion"  "Volt"   "Camry"  "Sonata"  "Leaf" ]
        [ 0.6156   0.7159    0.06138  0.9087   0.8098    0.1766 ]


The rank order of the alternatives is: Camry (0.9087), Sonata (0.8098), Fusion
(0.7159), Prius (0.6156), Leaf (0.1766), and Volt (0.06138). How does this
compare to the AHP rankings?

To begin sensitivity analysis, reduce the weight of cost by steps of 0.05,
modifying the other weights linearly to keep $\sum w_i = 1$, until cost is
overtaken as the highest weighted criterion. We find no changes in the
rank-ordering of our alternatives until 10 steps.


> AdjVec := [-1, 1./6$6]:
  DMWR;
  NewDMWR := fnormal(DMWR + 0.05 · AdjVec, 4);

      [0.3612, 0.2093, 0.1446, 0.1167, 0.0801, 0.0530, 0.0351]
      NewDMWR := [0.3112, 0.2176, 0.1529, 0.1250, 0.08843, 0.06133, 0.04343]

> NewResult[1] := TOPSIS(Labels, AltM, NewDMWR,
      Best = [min, max$6])[2, ..];

      NewResult[1] := [0.5728  0.6992  0.07139  0.8888  0.7947  0.1744]

> for i from 2 to 10 do
    NewDMWR := fnormal(NewDMWR + 0.05 · AdjVec, 4);
    NewResult[i] := TOPSIS(Labels, AltM, NewDMWR,
        Best = [min, max$6])[2, ..];
  end do:
> NewResult := Matrix(convert(convert~(NewResult, list), list)):

> Statistics:-LineChart(NewResult, gridlines = true, symbolsize = 14,
      symbol = solidbox, title = "Sensitivity Analysis Output", font =
      [TIMES, 16], labels = ["New Weights", "Alternative Rankings"],
      labeldirections = [horizontal, vertical]);

[Plot: “Sensitivity Analysis Output,” with axes “New Weights” and
“Alternative Rankings”]





We see the rankings are stable until some switching at the 10th step.
Modifying the other weights and determining the results’ sensitivity to
those changes is left to the exercises.
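
The linear weight adjustment used in this example can also be sketched
outside of Maple. The Python fragment below is an illustration only; the
function name adjust_weights is hypothetical. It lowers one weight by a step
and spreads that amount equally over the remaining weights, which keeps the
total unchanged. The starting weights are the DMWR values from the example.

```python
def adjust_weights(weights, index, step=0.05):
    """Lower weights[index] by step and share step equally among the others."""
    share = step / (len(weights) - 1)
    return [w - step if i == index else w + share for i, w in enumerate(weights)]


dmwr = [0.3612, 0.2093, 0.1446, 0.1167, 0.0801, 0.0530, 0.0351]
new_dmwr = adjust_weights(dmwr, 0)  # reduce the weight of cost (first criterion)
print([round(w, 4) for w in new_dmwr], round(sum(new_dmwr), 4))
```

Applying adjust_weights repeatedly to its own output mirrors the loop in the
Maple session; each new weight vector is then passed to the ranking method.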

Example 8.9. The Kite Network Redux. 

Revisit analyzing the Kite Network, this time with TOPSIS, to find the main 
influencers in the network. 
Use the same four criteria as Example 8.7 (pg. 370):


1. Total Centrality (TC)           2. Betweenness (BTW)
3. Eigenvector Centrality (EC)     4. Closeness Centrality (CC)


> CritLabels := [TC, BTW, EC, CC]:
  AltLabels := [Bob, Claire, Fred, Sarah, Susan, Steve, Claudia, Tom]:
  AltM := Matrix([[0.11111111,  0.019407559, 0.114399403, 0.100733634],
                  [0.11111111,  0.019407559, 0.114399403, 0.100733634],
                  [0.08333333,  0.0,         0.093757772, 0.09734763 ],
                  [0.125,       0.104187947, 0.137527978, 0.100733634],
                  [0.180555556, 0.202247191, 0.175080826, 0.122742664],
                  [0.138888889, 0.15526047,  0.137527978, 0.112866817],
                  [0.083333333, 0.317671093, 0.104202935, 0.108634312],
                  [0.055555556, 0.181818182, 0.024123352, 0.088318284]]):
Use the same weights as in AHP, which have a good CR of 0.01.

> DMWR := [0.4673, 0.2772, 0.1601, 0.09543]:

> TOPSIS(AltLabels, AltM, DMWR);
  Result[1] := convert(%[2, ..], list):
  MatrixSort(%%, 2, order = ’descending’);

      [ Susan   Claudia  Steve   Sarah   Tom     Claire  Bob     Fred   ]
      [ 0.7648  0.5850   0.5801  0.4576  0.3457  0.3033  0.3033  0.1766 ]


To begin sensitivity analysis, change the weight for the most heavily weighted
criterion, Total Centrality. Adjust the other weights linearly to keep
$\sum w_i = 1$.


> AdjVec := [-1, 1./3$3]:
  NewDMWR := fnormal(DMWR + 0.05 · AdjVec, 4):
  Result[2] := convert(TOPSIS(AltLabels, AltM, NewDMWR)[2, ..], list):
> for i from 3 to 8 do
    NewDMWR := fnormal(NewDMWR + 0.05 · AdjVec, 4);
    Result[i] := convert(TOPSIS(AltLabels, AltM, NewDMWR)[2, ..], list);
  end do:
> NewResult := Matrix(convert(Result, list)):
> Statistics:-LineChart(NewResult, gridlines = true, symbolsize = 14,
      symbol = solidbox, title = "Sensitivity Analysis Output", font =
      [TIMES, 16], labels = ["New Weights", "Alternative Rankings"],
      labeldirections = [horizontal, vertical]);
The plot of the results from sensitivity analysis shows that two sets of
alternatives change place. This shift is a significant change, and again
emphasizes the importance of sensitivity analysis.
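
Reading rank changes off a plot can be error-prone, so it may help to detect
them programmatically. The following Python sketch is one possible approach;
the helper names rank_order and rank_changes are hypothetical. It compares the
rank order produced at consecutive sensitivity steps and reports each step at
which the order changes. The two score rows are the Result and NewResult[1]
values printed in Example 8.8.

```python
def rank_order(labels, scores):
    """Labels sorted from highest to lowest score (ties keep the input order)."""
    pairs = sorted(zip(labels, scores), key=lambda p: p[1], reverse=True)
    return [label for label, _ in pairs]


def rank_changes(labels, score_rows):
    """Return (step, old_order, new_order) whenever consecutive rows rank differently."""
    changes = []
    previous = rank_order(labels, score_rows[0])
    for step, scores in enumerate(score_rows[1:], start=1):
        current = rank_order(labels, scores)
        if current != previous:
            changes.append((step, previous, current))
        previous = current
    return changes


labels = ["Prius", "Fusion", "Volt", "Camry", "Sonata", "Leaf"]
rows = [
    [0.6156, 0.7159, 0.06138, 0.9087, 0.8098, 0.1766],  # original weights
    [0.5728, 0.6992, 0.07139, 0.8888, 0.7947, 0.1744],  # after one 0.05 step
]
print(rank_order(labels, rows[0]))
print(rank_changes(labels, rows))  # an empty list means no change in rank order
```

A step at which rank_changes reports a change brackets a break point for that
weight; a smaller step size narrows the bracket.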





Exercises 


1. Redo Section 8.3’s Exercises (pg. 359) using TOPSIS. Compare with your 
previous results using SAW and using AHP. 


2. In each of the problems above, perform sensitivity analysis by changing
the weight of your dominant criterion until it is no longer the highest. Did
the change affect your rankings?


3. In each of the problems above, find break points, if any exist, for the 
weights. 


4. Suppose the dominant criterion has no break point. What does this indi- 
cate about the sensitivity of the solution to that criterion’s weight? 


5. Suppose the least weighted criterion has no break point. Can this criterion 
be eliminated without affecting the rankings? 
Projects


Project 1. Write a program using the technology of your choice to imple- 
ment: 

(a) SAW, 

(b) AHP, and 

(c) TOPSIS. 


Project 2. Enhance your program to perform sensitivity analysis. 


Project 3. Perform and discuss a comparative analysis using the Kite net- 
work and each MADM method. 








8.6 Methods of Choosing Weights 


Last, we consider several standard methods for choosing the weights for SAW, 
AHP, and TOPSIS. 


Rank Order Centroid Method 


The rank order centroid (ROC) method is a simple way of giving weights to 
a number of items ranked according to their importance. In general, decision 
makers can rank items much more easily and quickly than assigning weights. 
This method takes those ranks as inputs, and converts them to weights for 
each of m items based on the following schema. 


1. List the m items in rank-order from most important to least important. 


2. Assign the ith item in the list the weight $w_i$ where

\[ w_i = \frac{1}{m} \sum_{k=i}^{m} \frac{1}{k}. \]


E.g., for four items:


> m := 4:
  weights := [seq(w[i] = (1/m) · sum(1/k, k = i..m), i = 1..m)];
  add(weights);

      weights := [w[1] = 25/48, w[2] = 13/48, w[3] = 7/48, w[4] = 1/16]
      w[1] + w[2] + w[3] + w[4] = 1
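
The same computation can be checked with exact rational arithmetic in other
tools. Here is a minimal Python sketch; the function name roc_weights is an
assumption for illustration, and Fraction avoids any rounding.

```python
from fractions import Fraction


def roc_weights(m):
    """Rank order centroid weights: w_i = (1/m) * sum_{k=i}^{m} 1/k, i = 1..m."""
    return [
        Fraction(1, m) * sum(Fraction(1, k) for k in range(i, m + 1))
        for i in range(1, m + 1)
    ]


weights = roc_weights(4)
print(weights, sum(weights))
print([round(float(w), 2) for w in weights])
```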
ROC is simple and easy to follow, but it gives weights which are highly
dispersed. As an example, consider weighting the factors in the example below.


Example 8.10. Agency Project. 
Suppose an agency has selected four main criteria and put them in rank order 
as 


1. Shortening Schedule, 2. Project Cost, 3. Agency Control, 4. Competition 


Using ROC to weight the four criteria based on their importance and
influence as listed gives the weights calculated above:


$w_1 = 0.52$, $w_2 = 0.27$, $w_3 = 0.15$, and $w_4 = 0.06$.


These weights almost eliminate the effect of the fourth factor, encouraging
competition. This weighting could prove to be an issue.


Ratio Method 


The Ratio Method is another simple way of calculating weights for a selection 
of critical factors. The decision maker first rank-orders all items according to 
their importance. Next, each item is given an initial weight based on its rank 
beginning with the lowest ranked item given an initial weight of 10. The deci- 
sion maker moves up the ranking giving each successive item a weight that is a 
multiple of 10 and is an amount greater than the previous indicating the item’s 
relative importance. Last, the raw weights are normalized by dividing each by 
the sum of all. Each increase in weight is based on the differences between 
the items’ importance and is a subjective judgment of the decision maker. 
Ranking the items in the first step helps to assign more accurate weights. 

Rating the factors from Example 8.10 using the ratio method with initial 
weights assigned by the Project Manager gives 


Criterion | 1 2 3 4 
Raw Weight | 50 40 20 10 
Normalized Weight | 0.417 0.333 0.167 0.083 





While just as easy as ROC, the ratio method also suffers from dispersed 
weightings. 
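
The normalization step of the ratio method is a single division. The short
Python sketch below reproduces the normalized weights in the table from the
raw weights 50, 40, 20, and 10; the function name ratio_weights is
hypothetical.

```python
def ratio_weights(raw_weights):
    """Normalize raw ratio-method weights so that they sum to 1."""
    total = sum(raw_weights)
    return [r / total for r in raw_weights]


raw = [50, 40, 20, 10]  # criteria 1..4 from Example 8.10
normalized = ratio_weights(raw)
print([round(w, 3) for w in normalized])
```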


Pairwise Comparison 


We’ve used pairwise comparison in several examples of SAW, AHP, and TOP- 
SIS. The main task of pairwise comparison is to rank each item’s relative 
importance to other items individually, building a pairwise comparison matrix. 
We have used Saaty’s ranking system of 1 (equally important) to 9 (extremely 
more important) to assess relative importance. (See Table 8.10, pg. 362.) The 
reverse relation uses reciprocals. For instance, if A is 3 times more important 
than B, then B must be 1/3 as important as A. Using reciprocals makes the
pairwise comparison matrix a positive reciprocal matrix which must have a 
dominant eigenvalue with an associated eigenvector. That eigenvector, when 
normalized, gives a normalized set of weights. The eigenvalue helps determine 
the consistency of the decision maker’s ratings. Further analysis of methods 
and applications of consistency measurement appears in [Temesi2006]. 

Suppose we have the following pairwise comparison matrix from Exam- 
ple 8.10 with the criteria listed in rank order. 





> PCM := Matrix([[1,   5, 5/2, 8],
                 [1/5, 1, 1/2, 1],
                 [2/5, 2, 1,   2],
                 [1/8, 1, 1/2, 1]]):
> AnalyzeComparisonMat(PCM);

      [[0.5927  0.1046  0.2092  0.09356], 0.01037]


Note: The eigenvector can be quickly approximated by taking row-sums of
PCM and dividing by the sum of all elements. This quick technique gives the
criteria weights of approximately [0.61, 0.10, 0.20, 0.10] for PCM, very close
to the eigenvector weights. Note that quite often different factors will have
the same, or very close, weights.
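
Both the eigenvector weights and the row-sum approximation are easy to compute
directly. The Python sketch below is not the book’s AnalyzeComparisonMat; it
uses power iteration, one standard way to approximate the dominant eigenvector
of a positive matrix, and the function names are assumptions. It returns the
normalized weights together with an estimate of the dominant eigenvalue. The
consistency ratio is not computed here, since it also requires a random index
value.

```python
def principal_weights(pcm, iterations=100):
    """Approximate the normalized dominant eigenvector by power iteration."""
    n = len(pcm)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(pcm[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]  # rescale so the weights sum to 1
    # Since sum(w) = 1, summing the entries of (pcm * w) estimates the eigenvalue.
    eigenvalue = sum(sum(pcm[i][j] * w[j] for j in range(n)) for i in range(n))
    return w, eigenvalue


def row_sum_weights(pcm):
    """Quick approximation: row sums divided by the sum of all elements."""
    total = sum(sum(row) for row in pcm)
    return [sum(row) / total for row in pcm]


pcm = [
    [1, 5, 5 / 2, 8],
    [1 / 5, 1, 1 / 2, 1],
    [2 / 5, 2, 1, 2],
    [1 / 8, 1, 1 / 2, 1],
]
weights, eigenvalue = principal_weights(pcm)
print([round(x, 4) for x in weights], round(eigenvalue, 4))
print([round(x, 3) for x in row_sum_weights(pcm)])
```

For a perfectly consistent 4 × 4 matrix the dominant eigenvalue would be
exactly 4; the small excess above 4 here reflects the slight inconsistency in
the ratings.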


The Entropy Method 


In 1947, Shannon and Weaver¹¹ developed the entropy method for weighting
criteria. Entropy, in the context of weightings, is the probabilistic measure of
uncertainty in the information. A broad distribution represents more
uncertainty in specific values than a sharply peaked distribution. To see this
concept, compare the graphs of two normal distributions with mean 0, one having
a small standard deviation and the other a large. The entropy method
computations can be decomposed into simple steps. Begin with a matrix
$M_{m \times n}$ ranking the alternatives.


STEP 1. Calculate $R = [r_{ij}]$ which is M normalized by dividing each column
by its column-sum.


STEP 2. Compute the entropy vector $e = [e_j]$ for $j = 1..n$ with

\[ e_j = \frac{-1}{\ln(m)} \sum_{i=1}^{m} r_{ij} \ln(r_{ij}). \]





11 See C. Shannon and W. Weaver, “The Mathematical Theory of Communication,”
Univ. Illinois Press, 1949.
STEP 3. The normalized weights $w = [w_j]$ are then given by

\[ w_j = \frac{1 - e_j}{\sum_{k=1}^{n} (1 - e_k)} \quad \text{for } j = 1..n. \]

The degree of divergence vector $d = [d_j] = 1 - e$ indicates the average amount
of information contained by each attribute, with higher being more.
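
The three steps translate directly into code. The following Python sketch is
an illustration under stated assumptions, not the PSMv2 implementation: rows
are taken to be the alternatives and columns the criteria, all entries are
assumed positive, and the function name entropy_weights is hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy-method weights for the columns (criteria) of matrix."""
    m, n = len(matrix), len(matrix[0])
    entropies = []
    for j in range(n):
        # STEP 1: normalize column j by its column-sum.
        col_sum = sum(row[j] for row in matrix)
        r = [row[j] / col_sum for row in matrix]
        # STEP 2: entropy of column j, scaled by ln(m) so that 0 <= e_j <= 1.
        entropies.append(-sum(x * math.log(x) for x in r if x > 0) / math.log(m))
    # STEP 3: degree of divergence d_j = 1 - e_j, normalized to sum to 1.
    divergence = [1 - e for e in entropies]
    total = sum(divergence)
    return [d / total for d in divergence]


# Hypothetical 2 x 2 example: the first column is constant, so it carries no
# information and receives (essentially) zero weight.
print(entropy_weights([[1, 1], [1, 3]]))
```

A criterion on which all alternatives score the same has maximal entropy and
therefore no power to discriminate, which is why its weight vanishes.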

The book’s PSMv2 package contains EntropyWeights to calculate weights
from a matrix via the entropy method and Normalize to normalize a matrix
via column-sums. As an example, use the AltM preference matrix from the
Car Selection example (pg. 377) above.


> AltM := Matrix([[27.8, 9.4, 3, 7.5, 44,   40, 8.7],
                  [28.5, 9.6, 4, 8.4, 47,   47, 8.1],
                  [38.7, 9.6, 3, 8.2, 35,   40, 6.3],
                  [25.5, 9.4, 5, 7.8, 43,   39, 7.5],
                  [27.5, 9.6, 5, 7.6, 36,   40, 8.3],
                  [36.2, 9.4, 3, 8.1, 36.2, 40, 8.0]]):

> EntropyWeights(AltM);

      [ 0.2300  0.0010  0.4995  0.0155  0.1227  0.0389  0.0924 ]


Compare these weights with the other methods’ values. These weights can be 
used in our methods, SAW, AHP, or TOPSIS, in lieu of the pairwise compar- 
ison weight method. 





Comparison 


Table 8.11 shows the weights each method produces for the Agency Project 
of Example 8.10. 


TABLE 8.11: Comparing Weight Generating Methods 





Method         | Schedule   Project Cost   Agency Control   Competition
ROC            | 0.52       0.27           0.15             0.06
Ratio          | 0.417      0.333          0.167            0.083
Pairwise Comp  | 0.593      0.105          0.209            0.0936
Entropy        | 0.2492     0.203          0.203            0.345
Exercises


Repeat the exercises from Sections 8.3, 8.4, and 8.5. In each case, compare 
your new results to the results found using the pairwise comparison method. 


1. Use the rank order centroid method to generate weights. 
2. Use the ratio method to generate weights. 


3. Use the entropy method to generate weights. 




Projects 


Project 1. Prepare a presentation on “State of Art Surveys of Overviews on 
MCDM/MADM Methods” by Zavadskas, Turskis, and Kildiené in Technolog- 
ical And Economic Development Of Economy, 2014 Vol. 20(1), 165-179. 


Project 2. Mary Burton (www.maryburton.com) has discovered that she is 
being stalked by a deranged editor after switching her successful book, The 
Last Move, to a new publisher. She has decided to quietly relocate to one of 
the Windward Islands of the Caribbean. Her main criteria are: 


• Unemployment rate (as a proxy for crime rate)

• Military expenditures (as a proxy for security)

• Airport capacity

• GDP per capita

• Population growth rate


Choose a preference order for her criteria; explain your choices. Consult the 
CIA’s World Factbook “Country Comparisons”¹² to compile data on each
island. Select the best island using SAW, then re-select using AHP and TOP- 
SIS; use the different weight selection methods with each technique. Write a 
report to Ms. Burton describing your preference choices and the islands (with 
their preference ordering) you recommend. After sending your report and ver- 
ifying receipt, destroy all of your work so that the unhinged editor cannot use 
your recommendations to track her. 


Project 3. Each year, the US News and World Report produces a ranking
of the “Best States” in the US.¹³ Load the 2019 data from the book’s PSMv2
package by executing


> with(PSMv2):
  Describe(GetStateRankings);
  TheRankings := GetStateRankings();

Examine the criteria used and create your own preference ordering. Use each
of the methods of generating weights and each of the MADM methods to rank
the 50 states. Prepare a standard conference 4’ x 3’ poster to describe your
results and work.


12 Available at https://www.cia.gov/library/publications/resources/the-world-factbook/.
13 Available at https://www.usnews.com/news/best-states/rankings.